# Full text of "Introduction To Theory Of Computation"

INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION

MICHAEL SIPSER
Massachusetts Institute of Technology

THOMSON COURSE TECHNOLOGY
Australia • Canada • Mexico • Singapore • Spain • United Kingdom • United States

Introduction to the Theory of Computation, Second Edition, by Michael Sipser

Senior Product Manager: Alyssa Pratt
Executive Editor: Mac Mendelsohn
Associate Production Manager: Aimee Poirier
Senior Marketing Manager: Karen Seitz
Associate Product Manager: Mirella Misiaszek
Editorial Assistant: Jennifer Smith
Senior Manufacturing Coordinator: Trevor Kallop
Cover Designer: Steve Deschesne

COPYRIGHT © 2006 Thomson Course Technology, a division of Thomson Learning, Inc. Thomson Learning is a trademark used herein under license. Printed in the United States of America.

For more information, contact Thomson Course Technology, 25 Thomson Place, Boston, Massachusetts, 02210, or find us on the World Wide Web at www.course.com.

ALL RIGHTS RESERVED. No part of this work covered by the copyright hereon may be reproduced or used in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, Web distribution, or information storage and retrieval systems) without the written permission of the publisher.

For permission to use material from this text or product, submit a request online at http://www.thomsonrights.com. Any additional questions about permissions can be submitted by e-mail to thomsonrights@thomson.com.

Disclaimer: Thomson Course Technology reserves the right to revise this publication and make changes from time to time in its content without notice.

ISBN 0-534-95097-3

CONTENTS

Preface to the First Edition (To the student; To the educator; The first edition; Feedback to the author; Acknowledgments)
Preface to the Second Edition

0 Introduction
  0.1 Automata, Computability, and Complexity (Complexity theory; Computability theory; Automata theory)
  0.2 Mathematical Notions and Terminology (Sets; Sequences and tuples; Functions and relations; Graphs; Strings and languages; Boolean logic; Summary of mathematical terms)
  0.3 Definitions, Theorems, and Proofs (Finding proofs)
  0.4 Types of Proof (Proof by construction; Proof by contradiction; Proof by induction)
  Exercises, Problems, and Solutions

PART ONE: AUTOMATA AND LANGUAGES

1 Regular Languages
  1.1 Finite Automata (Formal definition of a finite automaton; Examples of finite automata; Formal definition of computation; Designing finite automata; The regular operations)
  1.2 Nondeterminism (Formal definition of a nondeterministic finite automaton; Equivalence of NFAs and DFAs; Closure under the regular operations)
  1.3 Regular Expressions (Formal definition of a regular expression; Equivalence with finite automata)
  1.4 Nonregular Languages (The pumping lemma for regular languages)
  Exercises, Problems, and Solutions

2 Context-Free Languages
  2.1 Context-free Grammars (Formal definition of a context-free grammar; Examples of context-free grammars; Designing context-free grammars; Ambiguity; Chomsky normal form)
  2.2 Pushdown Automata (Formal definition of a pushdown automaton; Examples of pushdown automata; Equivalence with context-free grammars)
  2.3 Non-context-free Languages (The pumping lemma for context-free languages)
  Exercises, Problems, and Solutions

PART TWO: COMPUTABILITY THEORY

3 The Church-Turing Thesis
  3.1 Turing Machines (Formal definition of a Turing machine; Examples of Turing machines)
  3.2 Variants of Turing Machines (Multitape Turing machines; Nondeterministic Turing machines; Enumerators; Equivalence with other models)
  3.3 The Definition of Algorithm (Hilbert's problems; Terminology for describing Turing machines)
  Exercises, Problems, and Solutions

4 Decidability
  4.1 Decidable Languages (Decidable problems concerning regular languages; Decidable problems concerning context-free languages)
  4.2 The Halting Problem (The diagonalization method; The halting problem is undecidable; A Turing-unrecognizable language)
  Exercises, Problems, and Solutions

5 Reducibility
  5.1 Undecidable Problems from Language Theory (Reductions via computation histories)
  5.2 A Simple Undecidable Problem
  5.3 Mapping Reducibility (Computable functions; Formal definition of mapping reducibility)
  Exercises, Problems, and Solutions

6 Advanced Topics in Computability Theory
  6.1 The Recursion Theorem (Self-reference; Terminology for the recursion theorem; Applications)
  6.2 Decidability of Logical Theories (A decidable theory; An undecidable theory)
  6.3 Turing Reducibility
  6.4 A Definition of Information (Minimal length descriptions; Optimality of the definition; Incompressible strings and randomness)
  Exercises, Problems, and Solutions

PART THREE: COMPLEXITY THEORY

7 Time Complexity
  7.1 Measuring Complexity (Big-O and small-o notation; Analyzing algorithms; Complexity relationships among models)
  7.2 The Class P (Polynomial time; Examples of problems in P)
  7.3 The Class NP (Examples of problems in NP; The P versus NP question)
  7.4 NP-completeness (Polynomial time reducibility; Definition of NP-completeness; The Cook-Levin Theorem)
  7.5 Additional NP-complete Problems (The vertex cover problem; The Hamiltonian path problem; The subset sum problem)
  Exercises, Problems, and Solutions

8 Space Complexity
  8.1 Savitch's Theorem
  8.2 The Class PSPACE
  8.3 PSPACE-completeness (The TQBF problem; Winning strategies for games; Generalized geography)
  8.4 The Classes L and NL
  8.5 NL-completeness (Searching in graphs)
  8.6 NL equals coNL
  Exercises, Problems, and Solutions

9 Intractability
  9.1 Hierarchy Theorems (Exponential space completeness)
  9.2 Relativization (Limits of the diagonalization method)
  9.3 Circuit Complexity
  Exercises, Problems, and Solutions

10 Advanced Topics in Complexity Theory
  10.1 Approximation Algorithms
  10.2 Probabilistic Algorithms (The class BPP; Primality; Read-once branching programs)
  10.3 Alternation (Alternating time and space; The polynomial time hierarchy)
  10.4 Interactive Proof Systems (Graph nonisomorphism; Definition of the model; IP = PSPACE)
  10.5 Parallel Computation (Uniform Boolean circuits; The class NC; P-completeness)
  10.6 Cryptography (Secret keys; Public-key cryptosystems; One-way functions; Trapdoor functions)
  Exercises, Problems, and Solutions

Selected Bibliography
Index

PREFACE TO THE FIRST EDITION

TO THE STUDENT

Welcome!
You are about to embark on the study of a fascinating and important subject: the theory of computation. It comprises the fundamental mathematical properties of computer hardware, software, and certain applications thereof. In studying this subject we seek to determine what can and cannot be computed, how quickly, with how much memory, and on which type of computational model. The subject has obvious connections with engineering practice, and, as in many sciences, it also has purely philosophical aspects.

I know that many of you are looking forward to studying this material but some may not be here out of choice. You may want to obtain a degree in computer science or engineering, and a course in theory is required—God knows why. After all, isn't theory arcane, boring, and worst of all, irrelevant?

To see that theory is neither arcane nor boring, but instead quite understandable and even interesting, read on. Theoretical computer science does have many fascinating big ideas, but it also has many small and sometimes dull details that can be tiresome. Learning any new subject is hard work, but it becomes easier and more enjoyable if the subject is properly presented. My primary objective in writing this book is to expose you to the genuinely exciting aspects of computer theory, without getting bogged down in the drudgery. Of course, the only way to determine whether theory interests you is to try learning it.

Theory is relevant to practice. It provides conceptual tools that practitioners use in computer engineering. Designing a new programming language for a specialized application? What you learned about grammars in this course comes in handy. Dealing with string searching and pattern matching? Remember finite automata and regular expressions. Confronted with a problem that seems to require more computer time than you can afford? Think back to what you learned about NP-completeness.
Various application areas, such as modern cryptographic protocols, rely on theoretical principles that you will learn here.

Theory also is relevant to you because it shows you a new, simpler, and more elegant side of computers, which we normally consider to be complicated machines. The best computer designs and applications are conceived with elegance in mind. A theoretical course can heighten your aesthetic sense and help you build more beautiful systems.

Finally, theory is good for you because studying it expands your mind. Computer technology changes quickly. Specific technical knowledge, though useful today, becomes outdated in just a few years. Consider instead the abilities to think, to express yourself clearly and precisely, to solve problems, and to know when you haven't solved a problem. These abilities have lasting value. Studying theory trains you in these areas.

Practical considerations aside, nearly everyone working with computers is curious about these amazing creations, their capabilities, and their limitations. A whole new branch of mathematics has grown up in the past 30 years to answer certain basic questions. Here's a big one that remains unsolved: If I give you a large number, say, with 500 digits, can you find its factors (the numbers that divide it evenly) in a reasonable amount of time? Even using a supercomputer, no one presently knows how to do that in all cases within the lifetime of the universe! The factoring problem is connected to certain secret codes in modern cryptosystems. Find a fast way to factor and fame is yours!

TO THE EDUCATOR

This book is intended as an upper-level undergraduate or introductory graduate text in computer science theory. It contains a mathematical treatment of the subject, designed around theorems and proofs. I have made some effort to accommodate students with little prior experience in proving theorems, though more experienced students will have an easier time.
My primary goal in presenting the material has been to make it clear and interesting. In so doing, I have emphasized intuition and "the big picture" in the subject over some lower level details.

For example, even though I present the method of proof by induction in Chapter 0 along with other mathematical preliminaries, it doesn't play an important role subsequently. Generally I do not present the usual induction proofs of the correctness of various constructions concerning automata. If presented clearly, these constructions convince and do not need further argument. An induction may confuse rather than enlighten because induction itself is a rather sophisticated technique that many find mysterious. Belaboring the obvious with an induction risks teaching students that mathematical proof is a formal manipulation instead of teaching them what is and what is not a cogent argument.

A second example occurs in Parts Two and Three, where I describe algorithms in prose instead of pseudocode. I don't spend much time programming Turing machines (or any other formal model). Students today come with a programming background and find the Church-Turing thesis to be self-evident. Hence I don't present lengthy simulations of one model by another to establish their equivalence.

Besides giving extra intuition and suppressing some details, I give what might be called a classical presentation of the subject material. Most theorists will find the choice of material, terminology, and order of presentation consistent with that of other widely used textbooks. I have introduced original terminology in only a few places, when I found the standard terminology particularly obscure or confusing. For example I introduce the term mapping reducibility instead of many-one reducibility.

Practice through solving problems is essential to learning any mathematical subject.
In this book, the problems are organized into two main categories called Exercises and Problems. The Exercises review definitions and concepts. The Problems require some ingenuity. Problems marked with a star are more difficult. I have tried to make both the Exercises and Problems interesting challenges.

THE FIRST EDITION

Introduction to the Theory of Computation first appeared as a Preliminary Edition in paperback. The first edition differs from the Preliminary Edition in several substantial ways. The final three chapters are new: Chapter 8 on space complexity; Chapter 9 on provable intractability; and Chapter 10 on advanced topics in complexity theory. Chapter 6 was expanded to include several advanced topics in computability theory. Other chapters were improved through the inclusion of additional examples and exercises.

Comments from instructors and students who used the Preliminary Edition were helpful in polishing Chapters 0-7. Of course, the errors they reported have been corrected in this edition.

Chapters 6 and 10 give a survey of several more advanced topics in computability and complexity theories. They are not intended to comprise a cohesive unit in the way that the remaining chapters are. These chapters are included to allow the instructor to select optional topics that may be of interest to the serious student. The topics themselves range widely. Some, such as Turing reducibility and alternation, are direct extensions of other concepts in the book. Others, such as decidable logical theories and cryptography, are brief introductions to large fields.

FEEDBACK TO THE AUTHOR

The internet provides new opportunities for interaction between authors and readers. I have received much e-mail offering suggestions, praise, and criticism, and reporting errors for the Preliminary Edition. Please continue to correspond! I try to respond to each message personally, as time permits.
The e-mail address for correspondence related to this book is sipserbook@math.mit.edu. A web site that contains a list of errata is maintained. Other material may be added to that site to assist instructors and students. Let me know what you would like to see there. The location for that site is http://www-math.mit.edu/~sipser/book.html.

ACKNOWLEDGMENTS

I could not have written this book without the help of many friends, colleagues, and my family.

I wish to thank the teachers who helped shape my scientific viewpoint and educational style. Five of them stand out. My thesis advisor, Manuel Blum, is due a special note for his unique way of inspiring students through clarity of thought, enthusiasm, and caring. He is a model for me and for many others. I am grateful to Richard Karp for introducing me to complexity theory, to John Addison for teaching me logic and assigning those wonderful homework sets, to Juris Hartmanis for introducing me to the theory of computation, and to my father for introducing me to mathematics, computers, and the art of teaching.

This book grew out of notes from a course that I have taught at MIT for the past 15 years. Students in my classes took these notes from my lectures. I hope they will forgive me for not listing them all. My teaching assistants over the years, Avrim Blum, Thang Bui, Andrew Chou, Benny Chor, Stavros Cosmadakis, Aditi Dhagat, Wayne Goddard, Parry Husbands, Dina Kravets, Jakov Kucan, Brian O'Neill, Ioana Popescu, and Alex Russell, helped me to edit and expand these notes and provided some of the homework problems.

Nearly three years ago, Tom Leighton persuaded me to write a textbook on the theory of computation. I had been thinking of doing so for some time, but it took Tom's persuasion to turn theory into practice. I appreciate his generous advice on book writing and on many other things.
I wish to thank Eric Bach, Peter Beebee, Cris Calude, Marek Chrobak, Anna Chefter, Guang-Ien Cheng, Elias Dahlhaus, Michael Fischer, Steve Fisk, Lance Fortnow, Henry J. Friedman, Jack Fu, Seymour Ginsburg, Oded Goldreich, Brian Grossman, David Harel, Micha Hofri, Dung T. Huynh, Neil Jones, H. Chad Lane, Kevin Lin, Michael Loui, Silvio Micali, Tadao Murata, Christos Papadimitriou, Vaughan Pratt, Daniel Rosenband, Brian Scassellati, Ashish Sharma, Nir Shavit, Alexander Shen, Ilya Shlyakhter, Matt Stallmann, Perry Susskind, Y. C. Tay, Joseph Traub, Osamu Watanabe, Peter Widmayer, David Williamson, Derick Wood, and Charles Yang for comments, suggestions, and assistance as the writing progressed.

The following people provided additional comments that have improved this book: Isam M. Abdelhameed, Eric Allender, Shay Artzi, Michelle Atherton, Rolfe Blodgett, Al Briggs, Brian E. Brooks, Jonathan Buss, Jin Yi Cai, Steve Chapel, David Chow, Michael Ehrlich, Yaakov Eisenberg, Farzan Fallah, Shaun Flisakowski, Hjalmtyr Hafsteinsson, C. R. Hale, Maurice Herlihy, Vegard Holmedahl, Sandy Irani, Kevin Jiang, Rhys Price Jones, James M. Jowdy, David M. Martin Jr., Manrique Mata-Montero, Ryota Matsuura, Thomas Minka, Farooq Mohammed, Tadao Murata, Jason Murray, Hideo Nagahashi, Kazuo Ohta, Constantine Papageorgiou, Joseph Raj, Rick Regan, Rhonda A. Reumann, Michael Rintzler, Arnold L. Rosenberg, Larry Roske, Max Rozenoer, Walter L. Ruzzo, Sanatan Sahgal, Leonard Schulman, Steve Seiden, Joel Seiferas, Ambuj Singh, David J. Stucki, Jayram S. Thathachar, H. Venkateswaran, Tom Whaley, Christopher Van Wyk, Kyle Young, and Kyoung Hwan Yun.

Robert Sloan used an early version of the manuscript for this book in a class that he taught and provided me with invaluable commentary and ideas from his experience with it. Mark Herschberg, Kazuo Ohta, and Latanya Sweeney read over parts of the manuscript and suggested extensive improvements.
Shafi Goldwasser helped me with material in Chapter 10. I received expert technical support from William Baxter at Superscript, who wrote the LaTeX macro package implementing the interior design, and from Larry Nolan at the MIT mathematics department, who keeps everything running.

It has been a pleasure to work with the folks at PWS Publishing in creating the final product. I mention Michael Sugarman, David Dietz, Elise Kaiser, Monique Calello, Susan Garland, and Tanja Brull because I have had the most contact with them, but I know that many others have been involved, too. Thanks to Jerry Moore for the copy editing, to Diane Levy for the cover design, and to Catherine Hawkes for the interior design.

I am grateful to the National Science Foundation for support provided under grant CCR-9503322.

My father, Kenneth Sipser, and sister, Laura Sipser, converted the book diagrams into electronic form. My other sister, Karen Fisch, saved us in various computer emergencies, and my mother, Justine Sipser, helped out with motherly advice. I thank them for contributing under difficult circumstances, including insane deadlines and recalcitrant software.

Finally, my love goes to my wife, Ina, and my daughter, Rachel. Thanks for putting up with all of this.

Cambridge, Massachusetts
October, 1996

Michael Sipser

PREFACE TO THE SECOND EDITION

Judging from the email communications that I've received from so many of you, the biggest deficiency of the first edition is that it provides no sample solutions to any of the problems. So here they are. Every chapter now contains a new Selected Solutions section that gives answers to a representative cross-section of that chapter's exercises and problems. To make up for the loss of the solved problems as interesting homework challenges, I've also added a variety of new problems.
Instructors may request an Instructor's Manual that contains additional solutions by contacting the sales representative for their region designated at www.course.com.

A number of readers would have liked more coverage of certain "standard" topics, particularly the Myhill-Nerode Theorem and Rice's Theorem. I've partially accommodated these readers by developing these topics in the solved problems. I did not include the Myhill-Nerode Theorem in the main body of the text because I believe that this course should provide only an introduction to finite automata and not a deep investigation. In my view, the role of finite automata here is for students to explore a simple formal model of computation as a prelude to more powerful models, and to provide convenient examples for subsequent topics. Of course, some people would prefer a more thorough treatment, while others feel that I ought to omit all reference to (or at least dependence on) finite automata. I did not include Rice's Theorem in the main body of the text because, though it can be a useful "tool" for proving undecidability, some students might use it mechanically without really understanding what is going on. Using reductions instead, for proving undecidability, gives more valuable preparation for the reductions that appear in complexity theory.

I am indebted to my teaching assistants, Ilya Baran, Sergi Elizalde, Rui Fan, Jonathan Feldman, Venkatesan Guruswami, Prahladh Harsha, Christos Kapoutsis, Julia Khodor, Adam Klivans, Kevin Matulef, Ioana Popescu, April Rasala, Sofya Raskhodnikova, and Iuliu Vasilescu, who helped me to craft some of the new problems and solutions. Ching Law, Edmond Kayi Lee, and Zulfikar Ramzan also contributed to the solutions. I thank Victor Shoup for coming up with a simple way to repair the gap in the analysis of the probabilistic primality algorithm that appears in the first edition.
I appreciate the efforts of the people at Course Technology in pushing me and the other parts of this project along, especially Alyssa Pratt and Aimee Poirier. Many thanks to Gerald Eisman, Weizhen Mao, Rupak Majumdar, Chris Umans, and Christopher Wilson for their reviews. I'm indebted to Jerry Moore for his superb job copy editing and to Laura Segel of ByteGraphics (lauras@bytegraphics.com) for her beautifully precise rendition of the figures.

The volume of email I've received has been more than I expected. Hearing from so many of you from so many places has been absolutely delightful, and I've tried to respond to all eventually—my apologies for those I missed. I've listed here the people who made suggestions that specifically affected this edition, but I thank everyone for their correspondence.

Luca Aceto, Arash Afkanpour, Rostom Aghanian, Eric Allender, Karun Bakshi, Brad Ballinger, Ray Bartkus, Louis Barton, Arnold Beckmann, Mihir Bellare, Kevin Trent Bergeson, Matthew Berman, Rajesh Bhatt, Somenath Biswas, Lenore Blum, Mauro A. Bonatti, Paul Bondin, Nicholas Bone, Ian Bratt, Gene Browder, Doug Burke, Sam Buss, Vladimir Bychkovsky, Bruce Carneal, Soma Chaudhuri, Rong-Jaye Chen, Samir Chopra, Benny Chor, John Clausen, Allison Coates, Anne Condon, Jeffrey Considine, John J. Crashed, Claude Crepeau, Shaun Cutts, Susheel M. Daswani, Geoff Davis, Scott Dexter, Peter Drake, Jeff Edmonds, Yaakov Eisenberg, Kurtcebe Eroglu, Georg Essl, Alexander T. Fader, Farzan Fallah, Faith Fich, Joseph E. Fitzgerald, Perry Fizzano, David Ford, Jeannie Fromer, Kevin Fu, Atsushi Fujioka, Michel Galley, K. Ganesan, Simson Garfinkel, Travis Gebhardt, Peymann Gohari, Ganesh Gopalakrishnan, Steven Greenberg, Larry Griffith, Jerry Grossman, Rudolf de Haan, Michael Halper, Nick Harvey, Mack Hendricks, Laurie Hiyakumoto, Steve Hockema, Michael Hoehle, Shahadat Hossain, Dave Isecke, Ghaith Issa, Raj D. Iyer, Christian Jacobi, Thomas Janzen, Mike D.
Jones, Max Kanovitch, Aaron Kaufman, Roger Khazan, Sarfraz Khurshid, Kevin Killourhy, Seungjoo Kim, Victor Kuncak, Kanata Kuroda, Suk Y. Lee, Edward D. Legenski, Li-Wei Lehman, Kong Lei, Zsolt Lengvarszky, Jeffrey Levetin, Baekjun Lim, Karen Livescu, Thomas Lasko, Stephen Louie, TzerHung Low, Wolfgang Maass, Arash Madani, Michael Manapat, Wojciech Marchewka, David M. Martin Jr., Anders Martinson, Lyle McGeoch, Alberto Medina, Kurt Mehlhorn, Nihar Mehta, Albert R. Meyer, Thomas Minka, Mariya Minkova, Daichi Mizuguchi, G. Allen Morris III, Damon Mosk-Aoyama, Xiaolong Mou, Paul Muir, German Muller, Donald Nelson, Gabriel Nivasch, Mary Obelnicki, Kazuo Ohta, Thomas M. Oleson, Jr., Curtis Oliver, Owen Ozier, Rene Peralta, Alexander Perlis, Holger Petersen, Detlef Plump, Robert Prince, David Pritchard, Bina Reed, Nicholas Riley, Ronald Rivest, Robert Robinson, Christi Rockwell, Phil Rogaway, Max Rozenoer, John Rupf, Teodor Rus, Larry Ruzzo, Brian Sanders, Cem Say, Kim Schioett, Joel Seiferas, Joao Carlos Setubal, Geoff Lee Seyon, Mark Skandera, Bob Sloan, Geoff Smith, Marc L. Smith, Stephen Smith, Alex C. Snoeren, Guy St-Denis, Larry Stockmeyer, Radu Stoleru, David Stucki, Hisham M. Sueyllam, Kenneth Tam, Elizabeth Thompson, Michel Toulouse, Eric Tria, Chittaranjan Tripathy, Dan Trubow, Hiroki Ueda, Giora Unger, Kurt L. Van Etten, Jesir Vargas, Bienvenido Velez-Rivera, Kobus Vos, Alex Vrenios, Sven Waibel, Marc Waldman, Tom Whaley, Anthony Widjaja, Sean Williams, Joseph N. Wilson, Chris Van Wyk, Guangming Xing, Vee Voon Yee, Cheng Yongxi, Neal Young, Timothy Yuen, Kyle Yung, Jinghua Zhang, Lilia Zollei.

Most of all I thank my family—Ina, Rachel, and Aaron—for their patience, understanding, and love as I sat for endless hours here in front of my computer screen.

Cambridge, Massachusetts
December, 2004

Michael Sipser

we know that P_{k+1} = P_k M - Y.
Therefore, using the induction hypothesis to calculate P_k,

    P_{k+1} = ( P M^k - Y (M^k - 1)/(M - 1) ) M - Y.

Multiplying through by M and rewriting Y as Y (M - 1)/(M - 1) yields

    P_{k+1} = P M^{k+1} - Y (M^{k+1} - M)/(M - 1) - Y (M - 1)/(M - 1)
            = P M^{k+1} - Y (M^{k+1} - 1)/(M - 1).

Thus the formula is correct for t = k + 1, which proves the theorem.

Problem 0.14 asks you to use the preceding formula to calculate actual mortgage payments.

EXERCISES

0.1 Examine the following formal descriptions of sets so that you understand which members they contain. Write a short informal English description of each set.

a. {1, 3, 5, 7, ...}
b. {..., -4, -2, 0, 2, 4, ...}
c. {n | n = 2m for some m in N}
d. {n | n = 2m for some m in N, and n = 3k for some k in N}
e. {w | w is a string of 0s and 1s and w equals the reverse of w}
f. {n | n is an integer and n = n + 1}

0.2 Write formal descriptions of the following sets.

a. The set containing the numbers 1, 10, and 100
b. The set containing all integers that are greater than 5
c. The set containing all natural numbers that are less than 5
d. The set containing the string aba
e. The set containing the empty string
f. The set containing nothing at all

0.3 Let A be the set {x, y, z} and B be the set {x, y}.

a. Is A a subset of B?
b. Is B a subset of A?
c. What is A ∪ B?
d. What is A ∩ B?
e. What is A × B?
f. What is the power set of B?

0.4 If A has a elements and B has b elements, how many elements are in A × B? Explain your answer.

0.5 If C is a set with c elements, how many elements are in the power set of C? Explain your answer.

0.6 Let X be the set {1, 2, 3, 4, 5} and Y be the set {6, 7, 8, 9, 10}. The unary function f: X → Y and the binary function g: X × Y → Y are described in the following tables.

| n | f(n) |
|---|------|
| 1 | 6    |
| 2 | 7    |
| 3 | 6    |
| 4 | 7    |
| 5 | 6    |

| g | 6  | 7  | 8  | 9  | 10 |
|---|----|----|----|----|----|
| 1 | 10 | 10 | 10 | 10 | 10 |
| 2 | 7  | 8  | 9  | 10 | 6  |
| 3 | 7  | 7  | 8  | 8  | 9  |
| 4 | 9  | 8  | 7  | 6  | 10 |
| 5 | 6  | 6  | 6  | 6  | 6  |

a. What is the value of f(2)?
b. What are the range and domain of f?
c. What is the value of g(2, 10)?
d.
What are the range and domain of g?
e. What is the value of g(4, f(4))?

0.7 For each part, give a relation that satisfies the condition.

a. Reflexive and symmetric but not transitive
b. Reflexive and transitive but not symmetric
c. Symmetric and transitive but not reflexive

0.8 Consider the undirected graph G = (V, E) where V, the set of nodes, is {1, 2, 3, 4} and E, the set of edges, is {{1,2}, {2,3}, {1,3}, {2,4}, {1,4}}. Draw the graph G. What is the degree of node 1? Of node 3? Indicate a path from node 3 to node 4 on your drawing of G.

0.9 Write a formal description of the following graph. (The graph figure is not reproduced in this transcription.)

PROBLEMS

0.10 Find the error in the following proof that 2 = 1.

Consider the equation a = b. Multiply both sides by a to obtain a^2 = ab. Subtract b^2 from both sides to get a^2 - b^2 = ab - b^2. Now factor each side, (a + b)(a - b) = b(a - b), and divide each side by (a - b) to get a + b = b. Finally, let a and b equal 1, which shows that 2 = 1.

0.11 Find the error in the following proof that all horses are the same color.

CLAIM: In any set of h horses, all horses are the same color.

PROOF: By induction on h.

Basis: For h = 1. In any set containing just one horse, all horses clearly are the same color.

Induction step: For k ≥ 1, assume that the claim is true for h = k and prove that it is true for h = k + 1. Take any set H of k + 1 horses. We show that all the horses in this set are the same color. Remove one horse from this set to obtain the set H1 with just k horses. By the induction hypothesis, all the horses in H1 are the same color. Now replace the removed horse and remove a different one to obtain the set H2. By the same argument, all the horses in H2 are the same color. Therefore all the horses in H must be the same color, and the proof is complete.

0.12 Show that every graph with 2 or more nodes contains two nodes that have equal degrees.

*0.13 Ramsey's theorem. Let G be a graph.
A clique in G is a subgraph in which every two nodes are connected by an edge. An anti-clique, also called an independent set, is a subgraph in which every two nodes are not connected by an edge. Show that every graph with n nodes contains either a clique or an anti-clique with at least (1/2) log2 n nodes.

*0.14 Use Theorem 0.25 to derive a formula for calculating the size of the monthly payment for a mortgage in terms of the principal P, the interest rate I, and the number of payments t. Assume that, after t payments have been made, the loan amount is reduced to 0. Use the formula to calculate the dollar amount of each monthly payment for a 30-year mortgage with 360 monthly payments on an initial loan amount of $100,000 with a 5% annual interest rate.

SELECTED SOLUTIONS

0.13 Make space for two piles of nodes, A and B. Then, starting with the entire graph, repeatedly add each remaining node x to A if its degree is greater than one half the number of remaining nodes and to B otherwise, and discard all nodes to which x isn't (is) connected if it was added to A (B). Continue until no nodes are left. At most half of the nodes are discarded at each of these steps, so at least log2 n steps will occur before the process terminates. Each step adds a node to one of the piles, so one of the piles ends up with at least (1/2) log2 n nodes. The A pile contains the nodes of a clique and the B pile contains the nodes of an anti-clique.

0.14 We let P_t = 0 and solve for Y to get the formula Y = P M^t (M - 1)/(M^t - 1). For P = $100,000, I = 0.05, and t = 360, we have M = 1 + (0.05)/12. We use a calculator to find that Y ≈ $536.82 is the monthly payment.

EXERCISES

A1.1 The following are the state diagrams of two DFAs, M1 and M2. (The diagrams are not reproduced in this transcription.) Answer the following questions about each of these machines.

a. What is the start state?
b. What is the set of accept states?
c. What sequence of states does the machine go through on input aabb?
d.
Does the machine accept the string aabb?
e. Does the machine accept the string ε?

A 1.2 Give the formal description of the machines M1 and M2 pictured in Exercise 1.1.

1.3 The formal description of a DFA M is ({q1, q2, q3, q4, q5}, {u, d}, δ, q3, {q3}), where δ is given by the following table. Give the state diagram of this machine.

         u    d
   q1   q1   q2
   q2   q1   q3
   q3   q2   q4
   q4   q3   q5
   q5   q4   q5

1.4 Each of the following languages is the intersection of two simpler languages. In each part, construct DFAs for the simpler languages, then combine them using the construction discussed in footnote 3 (page 46) to give the state diagram of a DFA for the language given. In all parts Σ = {a, b}.
a. {w | w has at least three a's and at least two b's}
A b. {w | w has exactly two a's and at least two b's}
c. {w | w has an even number of a's and one or two b's}
A d. {w | w has an even number of a's and each a is followed by at least one b}
e. {w | w has an even number of a's and one or two b's}
f. {w | w has an odd number of a's and ends with a b}
g. {w | w has even length and an odd number of a's}

CHAPTER 1 / REGULAR LANGUAGES

1.5 Each of the following languages is the complement of a simpler language. In each part, construct a DFA for the simpler language, then use it to give the state diagram of a DFA for the language given. In all parts Σ = {a, b}.
A a. {w | w does not contain the substring ab}
A b. {w | w does not contain the substring baba}
c. {w | w contains neither the substring ab nor the substring ba}
d. {w | w is any string not in a*b*}
e. {w | w is any string not in (ab+)*}
f. {w | w is any string not in a* ∪ b*}
g. {w | w is any string that doesn't contain exactly two a's}
h. {w | w is any string except a and b}

1.6 Give state diagrams of DFAs recognizing the following languages. In all parts the alphabet is {0,1}.
a. {w | w begins with a 1 and ends with a 0}
b. {w | w contains at least three 1s}
c. {w | w contains the substring 0101, i.e., w = x0101y for some x and y}
d.
{wj w has length at least 3 and its third symbol is a 0} e. {w| w starts with 0 and has odd length, or starts with 1 and has even length} f. {w| w doesn’t contain the substring 110} g. {w| the length of w is at most 5} h. {w | w is any string except 11 and 111} i. {w| every odd position of w is a 1} j. {w| w contains at least two 0s and at most one 1} k. {£,0} l. {tu| w contains an even number of 0s, or contains exactly two Is} m. The empty set n. All strings except the empty string 1.7 Give state diagrams of NFAs with the specified number of states recognizing each of the following languages. In all parts the alphabet is {0,1}. A a. The language {w| w ends with 00} with three states b. The language of Exercise 1.6c with five states c. The language of Exercise 1.61 with six states d. The language {0} with two states e. The language 0* 1* 0 + with three states A f. The language 1*(001 + )* with three states g. The language {e} with one state h. The language 0* with one state 1.8 Use the construction given in the proof of Theorem 1.45 to give the state diagrams of NFAs recognizing the union of the languages described in a. Exercises 1.6a and 1.6b. b. Exercises 1.6c and 1.6f. EXERCISES 85 1 .9 Use the construction given in the proof of Theorem 1.47 to give the state diagrams of NFAs recognizing the concatenation of the languages described in a. Exercises 1.6g and 1.6i. b. Exercises 1.6b and 1 . 6 m. 1.10 Use the construction given in the proof of Theorem 1.49 to give the state diagrams of NFAs recognizing the star of the language described in a. Exercise 1.6b. b. Exercise 1.6j. c. Exercise 1.6m. A 1.11 Prove that every N FA can be converted to an equivalent one that has a single accept state. 1.12 Let D — { w | V! contains an even number of a’s and an odd number of b’s and does not contain the substring ab}. Give a DFA with five states that recognizes D and a regular expression that generates D. (Suggestion: Describe D more simply.) 
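Exercise 1.4 asks for DFAs combined via the intersection (product) construction of footnote 3. The sketch below builds the product DFA for part (a)'s language; the dictionary-based DFA representation and the helper names are my own, not the book's.

```python
from itertools import product

def make_counter_dfa(symbol, threshold):
    """DFA over {a, b} counting occurrences of `symbol`, capped at `threshold`.

    States are 0..threshold; the sole accept state is the cap itself,
    so the DFA accepts strings with at least `threshold` occurrences.
    Returned as a tuple (states, delta, start, finals).
    """
    states = set(range(threshold + 1))
    delta = {}
    for q in states:
        for c in "ab":
            step = 1 if (c == symbol and q < threshold) else 0
            delta[(q, c)] = q + step
    return states, delta, 0, {threshold}

def intersect(d1, d2):
    """Product construction: run both DFAs in lockstep, accept when both do."""
    (s1, t1, q1, f1), (s2, t2, q2, f2) = d1, d2
    states = set(product(s1, s2))
    delta = {((p, q), c): (t1[(p, c)], t2[(q, c)])
             for (p, q) in states for c in "ab"}
    finals = {(p, q) for p in f1 for q in f2}
    return states, delta, (q1, q2), finals

def accepts(dfa, w):
    _, delta, q, finals = dfa
    for c in w:
        q = delta[(q, c)]
    return q in finals

# Language of Exercise 1.4(a): at least three a's AND at least two b's.
m = intersect(make_counter_dfa("a", 3), make_counter_dfa("b", 2))
```

For closure under union, the same product states would be kept and only `finals` changed to pairs where either component accepts.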
1.13 Let F be the language of all strings over {0,1} that do not contain a pair of 1s that are separated by an odd number of symbols. Give the state diagram of a DFA with 5 states that recognizes F. (You may find it helpful first to find a 4-state NFA for the complement of F.)

1.14 a. Show that, if M is a DFA that recognizes language B, swapping the accept and nonaccept states in M yields a new DFA that recognizes the complement of B. Conclude that the class of regular languages is closed under complement.
b. Show by giving an example that, if M is an NFA that recognizes language C, swapping the accept and nonaccept states in M doesn't necessarily yield a new NFA that recognizes the complement of C. Is the class of languages recognized by NFAs closed under complement? Explain your answer.

1.15 Give a counterexample to show that the following construction fails to prove Theorem 1.49, the closure of the class of regular languages under the star operation.(8) Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1. Construct N = (Q1, Σ, δ, q1, F) as follows. N is supposed to recognize A1*.
a. The states of N are the states of N1.
b. The start state of N is the same as the start state of N1.
c. F = {q1} ∪ F1. The accept states F are the old accept states plus the start state.
d. Define δ so that for any q ∈ Q1 and any a ∈ Σε,

   δ(q, a) = δ1(q, a)            if q ∉ F1 or a ≠ ε
   δ(q, a) = δ1(q, a) ∪ {q1}     if q ∈ F1 and a = ε

(Suggestion: Show this construction graphically, as in Figure 1.50.)

(8) In other words, you must present a finite automaton N1 for which the constructed automaton N does not recognize the star of A1's language.

1.16 Use the construction given in Theorem 1.39 to convert the following two nondeterministic finite automata to equivalent deterministic finite automata. (The diagrams are not reproduced in this scan.)

1.17 a. Give an NFA recognizing the language (01 ∪ 001 ∪ 010)*.
b. Convert this NFA to an equivalent DFA. Give only the portion of the DFA that is reachable from the start state.
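Exercises 1.16 and 1.17(b) use the subset construction of Theorem 1.39, restricted to the reachable part. The sketch below implements that; since the book's diagrams are not reproduced here, the sample 3-state NFA (strings ending in 00, in the spirit of Exercise 1.7(a)) is my own stand-in.

```python
from collections import deque

def nfa_to_dfa(states, alphabet, delta, start, finals):
    """Subset construction (Theorem 1.39), building only reachable subsets.

    `delta` maps (state, symbol) -> set of states; the symbol "" marks
    epsilon-moves.  Returns the DFA transition table, start subset, and
    accepting subsets.
    """
    def eclose(S):
        stack, closure = list(S), set(S)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), ()):
                if r not in closure:
                    closure.add(r)
                    stack.append(r)
        return frozenset(closure)

    start_set = eclose({start})
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        for c in alphabet:
            T = eclose({r for q in S for r in delta.get((q, c), ())})
            dfa_delta[(S, c)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, start_set, dfa_finals

# Hypothetical 3-state NFA for {w | w ends with 00}.
delta = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "0"): {2}}
dfa_delta, q0, F = nfa_to_dfa({0, 1, 2}, "01", delta, 0, {2})

def run(w):
    S = q0
    for c in w:
        S = dfa_delta[(S, c)]
    return S in F
```

On this NFA the construction produces only three reachable subsets ({0}, {0,1}, {0,1,2}) rather than all eight, which is exactly the point of Exercise 1.17(b).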
1.18 Give regular expressions generating the languages of Exercise 1.6. 1.19 Use the procedure described in Lemma 1.55 to convert the following regular ex¬ pressions to nondeterministic finite automata. a. (0U 1)*000(0U 1)* b. (((00)*(11))U01)* c. 0* 1.20 For each of the following languages, give two strings that are members and two strings that are not members—a total of four strings for each part. Assume the alphabet £ = {a,b} in all parts. a. a*b* b. a(ba)*b c. a* U b* d. (aaa)* e. £*a£*b£*a£* f. aba U bab g. (eUa)b h. (aUbaUbb)S* 1.21 Use the procedure described in Lemma 1.60 to convert the following finite au¬ tomata to regular expressions. EXERCISES 87 1.22 In certain programming languages, comments appear between delimiters such as /# and #/. Let C be the language of all valid delimited comment strings. A member of C must begin with /# and end with #/ but have no intervening #/. For simplic¬ ity, we’ll say that the comments themselves are written with only the symbols a and b; hence the alphabet of C is E = {a, b, /, #}. a. Give a DFA that recognizes C. b. Give a regular expression that generates C. A 1.23 Let B be any language over the alphabet E. Prove that B = B + iff BB C B. 1.24 A finite state transducer (FST) is a type of deterministic finite automaton whose output is a string and not just accept or reject. The following are state diagrams of finite state transducers T\ and 7 A 0/0 1/1 T, Each transition of an FST is labeled with two symbols, one designating the input symbol for that transition and the other designating the output symbol. The two symbols are written with a slash, /, separating them. In Ti, the transition from 7 i to c /2 has input symbol 2 and output symbol 1. Some transitions may have multiple input-output pairs, such as the transition in T\ from qi to itself. 
When an FST computes on an input string w, it takes the input symbols wi ■ ■ ■ w n one by one and, starting at the start state, follows the transitions by matching the input labels with the sequence of symbols wi ■ ■ ■ w„ = w. Every time it goes along a transition, it outputs the corresponding output symbol. For example, on input 2212011, machine Ti enters the sequence of states qi, < 72 , 72 , 72 , 72 , 71 , 71 , 71 and produces output 1111000. On input abbb, T 2 outputs 1011. Give the sequence of states entered and the output produced in each of the following parts. e. T 2 on input b f. T 2 on input bbab g. T 2 on input bbbbbb h. T -2 on input e 1.2 5 Read the informal definition of the finite state transducer given in Exercise 1.24. Give a formal definition of this model, following the pattern in Definition 1.5 (page 35). Assume that an FST has an input alphabet E and an output alphabet V but not a set of accept states. Include a formal definition of the computation of an FST. (Hint: An FST is a 5-tuple. Its transition function is of the form 5: Qx E— >Qx T.) a. T\ on input 011 b. Ti on input 211 c. Ti on input 121 d. Ti on input 0202 88 CHAPTER 1 / REGULAR LANGUAGES 1.26 Using the solution you gave to Exercise 1.25, give a formal description of the ma¬ chines Ti and Th depicted in Exercise 1.24. 1.27 Read the informal definition of the finite state transducer given in Exercise 1.24. Give the state diagram of an FST with the following behavior. Its input and output alphabets are {0,1}. Its output string is identical to the input string on the even positions but inverted on the odd positions. For example, on input 0000111 it should output 1010010. 1.28 Convert the following regular expressions to NFAs using the procedure given in Theorem 1.54. In all parts E = {a,b}. a. a(abb)* U b b. a + U(ab) + c. (aUb*)aV 1.29 Use the pumping lemma to show that the following languages are not regular. A a. Ai = {0"l n 2 n |n>0} b. Ai — {www | w € {a,b}*} A c. 
A3 = {a^(2^n) | n >= 0} (Here, a^(2^n) means a string of 2^n a's.)

1.30 Describe the error in the following "proof" that 0*1* is not a regular language. (An error must exist because 0*1* is regular.) The proof is by contradiction. Assume that 0*1* is regular. Let p be the pumping length for 0*1* given by the pumping lemma. Choose s to be the string 0^p 1^p. You know that s is a member of 0*1*, but Example 1.73 shows that s cannot be pumped. Thus you have a contradiction. So 0*1* is not regular.

PROBLEMS

1.31 For any string w = w1 w2 ... wn, the reverse of w, written w^R, is the string w in reverse order, wn ... w2 w1. For any language A, let A^R = {w^R | w ∈ A}. Show that if A is regular, so is A^R.

1.32 Let Σ3 = {[0;0;0], [0;0;1], [0;1;0], ..., [1;1;1]}, where each bracketed triple stands for one column of three bits, written top to bottom. Σ3 contains all size-3 columns of 0s and 1s. A string of symbols in Σ3 gives three rows of 0s and 1s. Consider each row to be a binary number and let B = {w ∈ Σ3* | the bottom row of w is the sum of the top two rows}. For example, [1;0;1][0;1;1] ∈ B (since 10 + 01 = 11), but [1;0;1][0;1;0] ∉ B. (The book's printed examples are garbled in this scan; these are equivalent ones.) Show that B is regular. (Hint: Working with B^R is easier. You may assume the result claimed in Problem 1.31.)

1.33 Let Σ2 = {[0;0], [0;1], [1;0], [1;1]}, with the same column convention. Here, Σ2 contains all columns of 0s and 1s of height two. A string of symbols in Σ2 gives two rows of 0s and 1s. Consider each row to be a binary number and let C = {w ∈ Σ2* | the bottom row of w is three times the top row}. For example, [0;0][0;1][1;1][0;0] ∈ C (since 0110 = 3 × 0010), but [0;1][0;1][1;0] ∉ C. Show that C is regular. (You may assume the result claimed in Problem 1.31.)

1.34 Let Σ2 be the same as in Problem 1.33. Consider each row to be a binary number and let D = {w ∈ Σ2* | the top row of w is a larger number than is the bottom row}. For example, [0;0][1;0][1;1][0;0] ∈ D, but [0;0][0;1][1;1][0;0] ∉ D. Show that D is regular.

1.35 Let Σ2 be the same as in Problem 1.33. Consider the top and bottom rows to be strings of 0s and 1s and let E = {w ∈ Σ2* | the bottom row of w is the reverse of the top row of w}. Show that E is not regular.
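Problem 1.32's hint suggests working with B^R. The reason: reading the columns from the least-significant end, binary addition needs only the carry bit as state, so a two-state DFA (carry 0 or 1, plus a dead state) suffices for B^R. A minimal sketch; the tuple encoding of columns is my own.

```python
def sum_check_reversed(columns):
    """Check bottom = top + middle, scanning columns least-significant first.

    Each column is a (top, middle, bottom) bit triple.  The DFA for the
    reversed language B^R needs only the carry as its state: reading a
    column (a, b, c) in state `carry` requires a + b + carry = c + 2*carry'.
    Accept iff the final carry is 0.
    """
    carry = 0  # the DFA state
    for a, b, c in columns:
        total = a + b + carry
        if total % 2 != c:
            return False  # dead state: this column contradicts the sum
        carry = total // 2
    return carry == 0

def rows_to_reversed_columns(top, mid, bot):
    """Pad the three rows to equal length and pair them up column-wise,
    least significant column first (i.e., the reversed string)."""
    n = max(len(top), len(mid), len(bot))
    top, mid, bot = (r.zfill(n) for r in (top, mid, bot))
    return [(int(t), int(m), int(b))
            for t, m, b in zip(reversed(top), reversed(mid), reversed(bot))]
```

Composing this with closure under reversal (Problem 1.31) gives the regularity of B itself.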
1.36 Let B n = {a fe | where k is a multiple of n}. Show that for each n > 1, the language B n is regular. 1.37 Let C n = {x| x is a binary number that is a multiple of n}. Show that for each n > 1 , the language C n is regular. 1.38 An till-NfA M is a 5-tuple (Q, E, S. qo . F) that accepts x € E* if every possible state that M could be in after reading input x is a state from F . Note, in contrast, that an ordinary NFA accepts a string if some state among these possible states is an accept state. Prove that all-NFAs recognize the class of regular languages. 1.39 The construction in Theorem 1.54 shows that every GNFA is equivalent to a GNFA with only two states. We can show that an opposite phenomenon occurs for DFAs. Prove that for every k > 1 a language A k C {0,1}* exists that is recognized by a DFA with k states but not by one with only k — 1 states. 1.40 Say that string x is a prefix of string y if a string 2 exists where xz = y and that x is a proper prefix of y if in addition x f y. In each of the following parts we define an operation on a language A. Show that the class of regular languages is closed under that operation. A a. NOPREFIX (A) = {w € A\ no proper prefix of w is a member of A}. b. NOEXTEND(A) = {w G A \ w is not the proper prefix of any string in A}. 1.41 For languages A and B, let the perfect shuffle of A and B be the language {utj w = aifai ■ • • a.k&fc, where ai • • ■ a*, £ A and bi ■ ■ ■ bk G B, each a*, bi 6 E}. Show that the class of regular languages is closed under perfect shuffle. 1.42 For languages A and P, let the shuffle of A and B be the language {w| w = a\b\ ■ ■ ■ akbk , where ai ■ • • at € A and 61 • ■ ■ bk € B, each a,, 6 , e E*}. Show that the class of regular languages is closed under shuffle. 90 CHAPTER 1 /REGULAR LANGUAGES 1.43 Let A be any language. Define DROP-OUT(A) to be the language containing all strings that can be obtained by removing one symbol from a string in A. 
Thus, DROP-OUT(A) = {xz\ xyz £ A where x, z £ £*, y £ E}. Show that the class of regular languages is closed under the DROP-OUT operation. Give both a proof by picture and a more formal proof by construction as in Theorem 1.47. a 1.44 Let B and C be languages over E = {0, 1}. Define B C = {wEB | for some yEC, strings w and y contain equal numbers of Is}. Show that the class of regular languages is closed under the operation. *1.45 Let A/B = { tu| wx E A for some x E B}. Show that if A is regular and B is any language then A/B is regular. 1.46 Prove that the following languages are not regular. You may use the pumping lemma and the closure of the class of regular languages under union, intersection, and complement. a. {0 n l m 0 n | m,n > 0} A b. {0 m l n \m^n} c. {io| w € {0,1}* is not a palindrome}^ d. {wtw\ w,t € {0,1} + } 1.47 Let E = {1, #} and let Y = {w\ w = xi#X 2 # ■ ■ ■ #Xk for k > 0, each Xi € 1*, and Xi ^ Xj for i ± j}. Prove that Y is not regular. 1.48 Let E = {0,1} and let D = {w\w contains an equal number of occurrences of the substrings 01 and 10}. Thus 101 g D because 101 contains a single 01 and a single 10, but 1010 0 D because 1010 contains two 10s and one 01. Show that D is a regular language. 1.49 a. Let B = {t k y\ y £ {0,1}* and y contains at least k Is, for k ^11- Show that B is a regular language. b. Let C = {l k y\ y 6 {0,1}* and y contains at most k Is, for fc> 11- Show that C isn’t a regular language. a 1.50 Read the informal definition of the finite state transducer given in Exercise 1.24. Prove that no FST can output w n for every input w if the input and output alpha¬ bets are {0,1}. 1.51 Let x and y be strings and let L be any language. We say that x and y are distin¬ guishable by L if some string z exists whereby exactly one of the strings xz and yz is a member of L; otherwise, for every string z, we have xz 6 L whenever yz 6 L and we say that x and y are indistinguishable by L. If x and y are indistinguishable by L we write x =l y. 
Show that =l is an equivalence relation. ^A palindrome is a string that reads the same forward and backward. PROBLEMS 91 A * 1.52 Myhill-Nerode theorem. Refer to Problem 1.51. Let L be a language and let X be a set of strings. Say that X is pairwise distinguishable by L if every two distinct strings in X are distinguishable by L. Define the index of L to be the maximum number of elements in any set that is pairwise distinguishable by L. The index of L may be finite or infinite. a. Show that, if L is recognized by a DFA with k states, L has index at most k. b. Show that, if the index of L is a finite number k, it is recognized by a DFA with k states. c. Conclude that L is regular iff it has finite index. Moreover, its index is the size of the smallest DFA recognizing it. 1.53 Let E = { 0 , 1 ,+, =} and ADD = {x=y+z | x, y, z are binary integers, and x is the sum of y and 2 }. Show that ADD is not regular. 1.54 Consider the language F = {Ab j c k \ i,j, k > 0 and if i = 1 th enj — k}. a. Show that F is not regular. b. Show that F acts like a regular language in the pumping lemma. In other words, give a pumping length p and demonstrate that F satisfies the three conditions of the pumping lemma for this value of p. c. Explain why parts (a) and (b) do not contradict the pumping lemma. 1.55 The pumping lemma says that every regular language has a pumping length p, such that every string in the language can be pumped if it has length p or more. If p is a pumping length for language A, so is any length p > p. The minimum pumping length for A is the smallest p that is a pumping length for A. For example, if A = 01 *, the minimum pumping length is 2. The reason is that the string s = 0 is in A and has length 1 yet s cannot be pumped, but any string in A of length 2 or more contains a 1 and hence can be pumped by dividing it so that x = 0, y = 1 , and z is the rest. For each of the following languages, give the minimum pumping length and justify your answer. A a. 0001 * A b. 
0 * 1 * c. 001 U 0*1* A d. 0*l + 0 + l*U10*1 e. ( 01 )* *1.56 If A is a set of natural numbers and A; is a natural number greater than 1, let Bk(A) = {«)| w is the representation in base fc of some number in A}. Here, we do not allow leading 0 s in the representation of a number. For example, £? 2 ({ 3 , 5}) = { 11 , 101 } and S 3 ({3, 5}) = { 10 , 12 }. Give an example of a set A for which B 2 (A) is regular but Bs(A) is not regular. Prove that your example works. 92 CHAPTER 1 / REGULAR LANGUAGES *1.57 If A is any language, let Ai_ be the set of all first halves of strings in A so that A i_ = {tc| for some y, \x\ = \y\ and xy E A}. Show that, if A is regular, then so is A i _ . ° 2 *1.58 If A is any language, let A±_ i be the set of all strings in A with their middle thirds removed so that At _i = {xz\ for some y , Jas| = |j/| = \z\ and xyz E A}. Show that, if A is regular, then Ai _ i is not necessarily regular. *1.59 Let M — (Q. E, <5, r/ () , F) be a DFA and let h be a state of M called its “home”. A synchronizing sequence for M and h is a string s E E* where S(q, s) = h for every q E Q. (Here we have extended S to strings, so that <5(<j, s) equals the state where M ends up when M starts at state q and reads input s.) Say that A 1 is synchronizable if it has a synchronizing sequence for some state h. Prove that, if M is a fc-state synchronizable DFA, then it has a synchronizing sequence of length at most k 3 . Can you improve upon this bound? 1.60 Let E = {a,b}. For each k > 1, let Ck be the language consisting of all strings that contain an a exactly k places from the right-hand end. Thus Ck = E*aE fc-1 . Describe an NFA with k + 1 states that recognizes Ck, both in terms of a state diagram and a formal description. 1.61 Consider the languages Ck defined in Problem 1.60. Prove that for each k, no DFA can recognize Ck with fewer than 2 k states. 1.62 Let E = {a, b}. 
For each k >= 1, let Dk be the language consisting of all strings that have at least one a among the last k symbols. Thus Dk = Σ*a(Σ ∪ ε)^(k-1). Describe a DFA with at most k + 1 states that recognizes Dk, both in terms of a state diagram and a formal description.

1.63 a. Let A be an infinite regular language. Prove that A can be split into two infinite disjoint regular subsets.
b. Let B and D be two languages. Write B ⋐ D if B ⊆ D and D contains infinitely many strings that are not in B. Show that, if B and D are two regular languages where B ⋐ D, then we can find a regular language C where B ⋐ C ⋐ D.

1.64 Let N be an NFA with k states that recognizes some language A.
a. Show that, if A is nonempty, A contains some string of length at most k.
b. Show, by giving an example, that part (a) is not necessarily true if you replace both A's by the complement of A.
c. Show that, if the complement of A is nonempty, it contains some string of length at most 2^k.
d. Show that the bound given in part (c) is nearly tight; that is, for each k, demonstrate an NFA recognizing a language Ak where the complement of Ak is nonempty and where its shortest member strings are of length exponential in k. Come as close to the bound in (c) as you can.

*1.65 Prove that, for each n > 0, a language Bn exists where
a. Bn is recognizable by an NFA that has n states, and
b. if Bn = A1 ∪ ... ∪ Ak, for regular languages Ai, then at least one of the Ai requires a DFA with exponentially many states.

SELECTED SOLUTIONS

1.1 For M1: (a) q1; (b) {q2}; (c) q1, q2, q3, q1, q1; (d) No; (e) No.
For M2: (a) q1; (b) {q1, q4}; (c) q1, q1, q1, q2, q4; (d) Yes; (e) Yes.

1.2 M1 = ({q1, q2, q3}, {a, b}, δ1, q1, {q2}). M2 = ({q1, q2, q3, q4}, {a, b}, δ2, q1, {q1, q4}).
The transition functions are

   δ1     a    b        δ2     a    b
   q1    q2   q1        q1    q1   q2
   q2    q3   q3        q2    q3   q4
   q3    q2   q1        q3    q2   q1
                        q4    q3   q4

1.4 (b) The following are DFAs for the two languages {w | w has exactly two a's} and {w | w has at least two b's}. (Diagrams not reproduced in this scan.) Combining them using the intersection construction gives a DFA for the language in question. Though the problem doesn't request you to simplify the DFA, certain states can be combined to give a smaller one.

(d) These are DFAs for the two languages {w | w has an even number of a's} and {w | each a is followed by at least one b}. (Diagrams not reproduced.) Combining them using the intersection construction gives the desired DFA. Though the problem doesn't request you to simplify the DFA, certain states can be combined to give a smaller one.

1.5 (a) The left-hand DFA recognizes {w | w contains ab}. The right-hand DFA recognizes its complement, {w | w doesn't contain ab}. (b) This DFA recognizes {w | w contains baba}. (Diagrams not reproduced.)

1.7 (a) (Diagram not reproduced.)

1.11 Let N = (Q, Σ, δ, q0, F) be any NFA. Construct an NFA N' with a single accept state that accepts the same language as N. Informally, N' is exactly like N except it has ε-transitions from the states corresponding to the accept states of N to a new accept state, q_accept. State q_accept has no outgoing transitions. More formally, N' = (Q ∪ {q_accept}, Σ, δ', q0, {q_accept}), where for each q ∈ Q and a ∈ Σε,

   δ'(q, a) = δ(q, a)                 if a ≠ ε or q ∉ F
   δ'(q, a) = δ(q, a) ∪ {q_accept}    if a = ε and q ∈ F

and δ'(q_accept, a) = ∅ for each a ∈ Σε.

1.23 We prove both directions of the "iff."
(→) Assume that B = B+ and show that BB ⊆ B. For every language, BB ⊆ B+ holds, so if B = B+, then BB ⊆ B.
(←) Assume that BB ⊆ B and show that B = B+. For every language B ⊆ B+, so we need to show only B+ ⊆ B. If w ∈ B+, then w = x1 x2 ... xk, where each xi ∈ B and k >= 1. Because x1, x2 ∈ B and BB ⊆ B, we have x1 x2 ∈ B. Similarly, because x1 x2 is in B and x3 is in B, we have x1 x2 x3 ∈ B. Continuing in this way, x1 ... xk ∈ B. Hence w ∈ B, and so we may conclude that B+ ⊆ B.
96 CHAPTER 1 /REGULAR LANGUAGES The latter argument may be written formally as the following proof by induction. Assume that BB C B. Claim: For each fc > 1, if xi,..., x k £ B, then xi ■ ■ ■ x k 6 B. Basis: Prove for k — 1. This statement is obviously true. Induction step: For each k > 1, assume that the claim is true for k and prove it to be true for k + 1. If x\ ,..., Xfc, Xk+i G B, then by the induction assumption, x\ ■ ■ ■ Xk G B. There¬ fore x\ ■ ■ ■ ifcXfc+i € BB, but BB C B, so x\ ■ ■ • x k+1 £ B. That proves the induction step and the claim. The claim implies that, if BB C B, then B + C B. 1.29 (a) Assume that A\ = { 0 n l n 2 n \ n > 0} is regular. Let p be the pumping length given by the pumping lemma. Choose s to be the string 0 P 1 P 2 P . Because s is a member of A\ and s is longer than p, the pumping lemma guarantees that s can be split into three pieces, s = xyz, where for any i > 0 the string xy‘z is in A i. Consider two possibilities: 1. The string y consists only of Os, only of Is, or only of 2s. In these cases the string xyyz will not have equal numbers of Os, Is, and 2s. Hence xyyz is not a member of Ai, a contradiction. 2. The string y consists of more than one kind of symbol. In this case xyyz will have the Os, Is, or 2s out of order. Hence xyyz is not a member of Ai, a contradiction. Either way we arrive at a contradiction. Therefore, A\ is not regular. (c) Assume that A 3 = {a 1 2 3 4 " | n > 0} is regular. Letp be the pumping length given by the pumping lemma. Choose s to be the string a 2 ' . Because s is a member of Ai and ,s is longer than p, the pumping lemma guarantees that s can be split into three pieces, s = xyz, satisfying the three conditions of the pumping lemma. The third condition tells us that \xy\ < p. Furthermore, p < 2 P and so \y\ < 2 P . Therefore \xyyz\ — \xyz\ + |t/| < 2 P + 2 P = 2 P+1 . The second condition requires \y\ > 1 so2 p < \xyyz\ < 2 P+1 . The length of xyyz cannot be a power of 2. 
Hence xyyz is not a member of A3, a contradiction. Therefore, A3 is not regular.

1.40 Let M = (Q, Σ, δ, q0, F) be a DFA recognizing A, where A is some regular language. Construct NFA M' = (Q', Σ, δ', q0', F') recognizing NOPREFIX(A) as follows:
1. Q' = Q.
2. For r ∈ Q' and a ∈ Σ define δ'(r, a) = {δ(r, a)} if r ∉ F, and δ'(r, a) = ∅ if r ∈ F.
3. q0' = q0.
4. F' = F.

1.44 Let MB = (QB, Σ, δB, qB, FB) and MC = (QC, Σ, δC, qC, FC) be DFAs recognizing B and C respectively. Construct NFA M = (Q, Σ, δ, q0, F) that recognizes the language of Problem 1.44, {w ∈ B | for some y ∈ C, strings w and y contain equal numbers of 1s}, as follows. To decide whether its input w is in this language, the machine M checks that w ∈ B, and in parallel nondeterministically guesses a string y that contains the same number of 1s as contained in w and checks that y ∈ C.
1. Q = QB × QC.
2. For (q, r) ∈ Q and a ∈ Σε define

   δ((q, r), 0) = {(δB(q, 0), r)}
   δ((q, r), 1) = {(δB(q, 1), δC(r, 1))}
   δ((q, r), ε) = {(q, δC(r, 0))}

3. q0 = (qB, qC).
4. F = FB × FC.

1.46 (b) Let B = {0^m 1^n | m ≠ n}. Observe that the complement of B, intersected with 0*1*, equals {0^k 1^k | k >= 0}. If B were regular, then its complement would be regular and so would that intersection. But we already know that {0^k 1^k | k >= 0} isn't regular, so B cannot be regular.
Alternatively, we can prove B to be nonregular by using the pumping lemma directly, though doing so is trickier. Assume that B = {0^m 1^n | m ≠ n} is regular. Let p be the pumping length given by the pumping lemma. Observe that p! is divisible by all integers from 1 to p, where p! = p(p-1)(p-2) ... 1. The string s = 0^p 1^(p+p!) ∈ B, and |s| >= p. Thus the pumping lemma implies that s can be divided as xyz with x = 0^a, y = 0^b, and z = 0^c 1^(p+p!), where b >= 1 and a + b + c = p. Let s' be the string x y^(i+1) z, where i = p!/b. Then y^i = 0^(p!), so x y^(i+1) z = 0^(a+b+c+p!) 1^(p+p!). That gives x y^(i+1) z = 0^(p+p!) 1^(p+p!) ∉ B, a contradiction.

1.50 Assume to the contrary that some FST T outputs w^R on input w. Consider the input strings 00 and 01.
On input 00, T must output 00, and on input 01, T must output 10. In both cases the first input bit is a 0 but the first output bits differ. Operating in this way is impossible for an FST because it produces its first output bit before it reads its second input. Hence no such FST can exist. 1.52 (a) We prove this assertion by contradiction. Let M be a k-state DFA that recog¬ nizes L. Suppose for a contradiction that L has index greater than fc. That means some set X with more than fc elements is pairwise distinguishable by L. Because M has fc states, the pigeonhole principle implies that X contains two distinct strings x and y, where 6(qo, x) — 5(qo,y). Here 6(qo,x) is the state that M is in after start¬ ing in the start state qo and reading input string x. Then, for any string 2 £ E*, 8(qo,xz) = 5(qo,yz). Therefore either both xz and yz are in L or neither are in L. But then x and y aren’t distinguishable by L, contradicting our assumption that X is pairwise distinguishable by L. (b) Let X = {«!,...,«*} be pairwise distinguishable by L. We construct DFA M — (<5,E, 6, qo,F) with fc states recognizing L. Let Q — {qi,... ,qk}, and define 8(qi.a) to be qj, where Sj =l s,a (the relation =l is defined in Prob¬ lem 1.51). Note that Sj =l Sid for some .s ; € X; otherwise, X U Sia would have fc + 1 elements and would be pairwise distinguishable by L, which would contra¬ dict the assumption that L has index fc. Let F = {q, | Si £ L}. Let the start state qo be the qi such that Si =l £• M is constructed so that, for any state g,, {s| 8(qa,s) = qi} — {s| s ~l Si}. Hence M recognizes L. 98 CHAPTER 1 / REGULAR LANGUAGES (c) Suppose that L is regular and let k be the number of states in a DFA recognizing L. Then from part (a) L has index at most k. Conversely, if L has index k, then by part (b) it is recognized by a DFA with k states and thus is regular. To show that the index of L is the size of the smallest DFA accepting it, suppose that L’s index is exactly k. 
Then, by part (b), there is a fc-state DFA accepting L. That is the smallest such DFA because if it were any smaller, then we could show by part (a) that the index of L is less than k. 1.55 (a) The minimum pumping length is 4. The string 000 is in the language but cannot be pumped, so 3 is not a pumping length for this language. If s has length 4 or more, it contains Is. By dividing s onto xyz, where x is 000 and y is the first 1 and z is everything afterward, we satisfy the pumping lemma’s three conditions, (b) The minimum pumping length is 1. The pumping length cannot be 0 because the string e is in the language and it cannot be pumped. Every nonempty string in the language can be divided into xyz, where x = e and y is the first character and z is the remainder. This division satisfies the three conditions. (d) The minimum pumping length is 3. The pumping length cannot be 2 because the string 11 is in the language and it cannot be pumped. Let s be a string in the language of length at least 3. If s is generated by 0*l + 0 + T, we can write it as xyz, where x is the empty string, y is the first symbol of s, and 2 is the remainder of s. Breaking s up in this way shows that it can be pumped. If s is generated by 10* 1, we can write it as xyz, where a: = 1 and y = 0 and 2 is the remainder of s. This division gives a way to pump s. 1 28 CHAPTER 2 / CONTEXT-FREE LANGUAGES EXERCISES 2.1 Recall the CFG G 4 that we gave in Example 2.4. For convenience, let’s rename its variables with single letters as follows. E —> E + T \ T T —> T x F \ F F -> ( E) | a Give parse trees and derivations for each string. a. a c. a+a+a b. a+a d. ((a)) 2.2 a. Use the languages A — {a m b n c”| m,n > 0} and B = {a n b n c m | m, n > 0} together with Example 2.36 to show that the class of context-free languages is not closed under intersection. b. Use part (a) and DeMorgan’s law (Theorem 0.20) to show that the class of context-free languages is not closed under complementation. 
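Exercise 2.1's grammar E → E+T | T, T → T×F | F, F → (E) | a is left-recursive, so a naive recursive-descent parser would loop forever on E. One standard workaround (my choice here, not the book's) is to parse the left-recursive rules iteratively: E = T ('+' T)*, T = F ('×' F)*. A minimal recognizer sketch in Python, writing '*' for the multiplication sign:

```python
def matches(s):
    """Recognizer for the grammar of Exercise 2.1:
         E -> E + T | T,   T -> T x F | F,   F -> (E) | a
    Left recursion is replaced by iteration, which preserves the language
    (though not the parse trees the exercise asks you to draw by hand).
    """
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def parse_F():
        nonlocal pos
        if peek() == "a":
            pos += 1
            return True
        if peek() == "(":
            pos += 1
            if parse_E() and peek() == ")":
                pos += 1
                return True
        return False

    def parse_T():          # T = F ('*' F)*
        nonlocal pos
        if not parse_F():
            return False
        while peek() == "*":
            pos += 1
            if not parse_F():
                return False
        return True

    def parse_E():          # E = T ('+' T)*
        nonlocal pos
        if not parse_T():
            return False
        while peek() == "+":
            pos += 1
            if not parse_T():
                return False
        return True

    return parse_E() and pos == len(s)
```

Because '+' is handled at the outer level and '*' at the inner level, the iterative version also reproduces the grammar's precedence of × over +.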
A 2.3 Answer each part for the following context-free grammar G.
   R → XRX | S
   S → aTb | bTa
   T → XTX | X | ε
   X → a | b
a. What are the variables of G?
b. What are the terminals of G?
c. Which is the start variable of G?
d. Give three strings in L(G).
e. Give three strings not in L(G).
f. True or False: T ⇒ aba.
g. True or False: T ⇒* aba.
h. True or False: T ⇒ T.
i. True or False: T ⇒* T.
j. True or False: XXX ⇒* aba.
k. True or False: X ⇒* aba.
l. True or False: T ⇒* XX.
m. True or False: T ⇒* XXX.
n. True or False: S ⇒* ε.
o. Give a description in English of L(G).

2.4 Give context-free grammars that generate the following languages. In all parts the alphabet Σ is {0,1}.
A a. {w | w contains at least three 1s}
b. {w | w starts and ends with the same symbol}
c. {w | the length of w is odd}
A d. {w | the length of w is odd and its middle symbol is a 0}
e. {w | w = w^R, that is, w is a palindrome}
f. The empty set

2.5 Give informal descriptions and state diagrams of pushdown automata for the languages in Exercise 2.4.

2.6 Give context-free grammars generating the following languages.
A a. The set of strings over the alphabet {a, b} with more a's than b's
b. The complement of the language {a^n b^n | n >= 0}
A c. {w#x | w^R is a substring of x, for w, x ∈ {0,1}*}
d. {x1#x2# ... #xk | k >= 1, each xi ∈ {a, b}*, and for some i and j, xi = xj^R}

A 2.7 Give informal English descriptions of PDAs for the languages in Exercise 2.6.

A 2.8 Show that the string "the girl touches the boy with the flower" has two different leftmost derivations in grammar G2 on page 101. Describe in English the two different meanings of this sentence.

2.9 Give a context-free grammar that generates the language A = {a^i b^j c^k | i = j or j = k, where i, j, k >= 0}. Is your grammar ambiguous? Why or why not?

2.10 Give an informal description of a pushdown automaton that recognizes the language A in Exercise 2.9.
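The PDA of Exercise 2.10 works by a nondeterministic guess: either match the a's against the b's, or match the b's against the c's. Deterministic code can simulate that guess by simply trying both branches; a small sketch (the regex form-check is my own device, not part of the book's construction):

```python
import re

def in_A(w):
    """Membership test for A = {a^i b^j c^k : i = j or j = k}.

    Mirrors the PDA sketch for Exercise 2.10: first verify the input has
    the shape a^i b^j c^k, then take both nondeterministic branches
    (i == j, or j == k) and accept if either succeeds.
    """
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if not m:
        return False  # not of the form a^i b^j c^k
    i, j, k = (len(g) for g in m.groups())
    return i == j or j == k
```

The same either/or structure is why the natural grammar for A in Exercise 2.9 is ambiguous: strings with i = j = k are derivable through both branches.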
2.11 Convert the CFG G_4 given in Exercise 2.1 to an equivalent PDA, using the procedure given in Theorem 2.20.

2.12 Convert the CFG G given in Exercise 2.3 to an equivalent PDA, using the procedure given in Theorem 2.20.

2.13 Let G = (V, Σ, R, S) be the following grammar. V = {S, T, U}; Σ = {0, #}; and R is the set of rules:
S → TT | U
T → 0T | T0 | #
U → 0U00 | #
a. Describe L(G) in English.
b. Prove that L(G) is not regular.

2.14 Convert the following CFG into an equivalent CFG in Chomsky normal form, using the procedure given in Theorem 2.9.
A → BAB | B | ε
B → 00 | ε

2.15 Give a counterexample to show that the following construction fails to prove that the class of context-free languages is closed under star. Let A be a CFL that is generated by the CFG G = (V, Σ, R, S). Add the new rule S → SS and call the resulting grammar G'. This grammar is supposed to generate A*.

2.16 Show that the class of context-free languages is closed under the regular operations, union, concatenation, and star.

2.17 Use the results of Problem 2.16 to give another proof that every regular language is context free, by showing how to convert a regular expression directly to an equivalent context-free grammar.

PROBLEMS

ᴬ2.18 a. Let C be a context-free language and R be a regular language. Prove that the language C ∩ R is context free.
b. Use part (a) to show that the language A = {w | w ∈ {a,b,c}* and w contains equal numbers of a's, b's, and c's} is not a CFL.

*2.19 Let CFG G be
S → aSb | bY | Ya
Y → bY | aY | ε
Give a simple description of L(G) in English. Use that description to give a CFG for the complement of L(G).

2.20 Let A/B = {w | wx ∈ A for some x ∈ B}. Show that, if A is context free and B is regular, then A/B is context free.

*2.21 Let Σ = {a,b}. Give a CFG generating the language of strings with twice as many a's as b's. Prove that your grammar is correct.

*2.22 Let C = {x#y | x, y ∈ {0,1}* and x ≠ y}.
Show that C is a context-free language.

*2.23 Let D = {xy | x, y ∈ {0,1}* and |x| = |y| but x ≠ y}. Show that D is a context-free language.

*2.24 Let E = {a^i b^j | i ≠ j and 2i ≠ j}. Show that E is a context-free language.

2.25 For any language A, let SUFFIX(A) = {v | uv ∈ A for some string u}. Show that the class of context-free languages is closed under the SUFFIX operation.

2.26 Show that, if G is a CFG in Chomsky normal form, then for any string w ∈ L(G) of length n ≥ 1, exactly 2n − 1 steps are required for any derivation of w.

*2.27 Let G = (V, Σ, R, ⟨STMT⟩) be the following grammar.
⟨STMT⟩ → ⟨ASSIGN⟩ | ⟨IF-THEN⟩ | ⟨IF-THEN-ELSE⟩
⟨IF-THEN⟩ → if condition then ⟨STMT⟩
⟨IF-THEN-ELSE⟩ → if condition then ⟨STMT⟩ else ⟨STMT⟩
⟨ASSIGN⟩ → a:=1
Σ = {if, condition, then, else, a:=1}. V = {⟨STMT⟩, ⟨IF-THEN⟩, ⟨IF-THEN-ELSE⟩, ⟨ASSIGN⟩}.
G is a natural-looking grammar for a fragment of a programming language, but G is ambiguous.
a. Show that G is ambiguous.
b. Give a new unambiguous grammar for the same language.

*2.28 Give unambiguous CFGs for the following languages.
a. {w | in every prefix of w the number of a's is at least the number of b's}
b. {w | the number of a's and the number of b's in w are equal}
c. {w | the number of a's is at least the number of b's}

*2.29 Show that the language A in Exercise 2.9 is inherently ambiguous.

2.30 Use the pumping lemma to show that the following languages are not context free.
a. {0^n 1^n 0^n 1^n | n ≥ 0}
ᴬb. {0^n # 0^2n # 0^3n | n ≥ 0}
ᴬc. {w#t | w is a substring of t, where w, t ∈ {a,b}*}
d. {t_1#t_2# ... #t_k | k ≥ 2, each t_i ∈ {a,b}*, and t_i = t_j for some i ≠ j}

2.31 Let B be the language of all palindromes over {0,1} containing an equal number of 0s and 1s. Show that B is not context free.

*2.32 Let Σ = {1, 2, 3, 4} and C = {w ∈ Σ* | in w, the number of 1s equals the number of 2s, and the number of 3s equals the number of 4s}. Show that C is not context free.
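Pumping-lemma arguments like the one asked for in 2.30(b) can be sanity-checked by brute force for a small candidate pumping length: enumerate every legal split s = uvxyz and test a few values of i. The helper names (`in_B`, `pumpable`) and the choice of checking only i = 0, 1, 2 are our own illustration; finding no pumpable split certifies that the chosen p fails, which is exactly what the non-context-freeness argument needs.

```python
def in_B(s):
    """Membership test for B = {0^n # 0^2n # 0^3n | n >= 0} from 2.30(b)."""
    parts = s.split('#')
    if len(parts) != 3 or any(set(p) - {'0'} for p in parts):
        return False
    a, b, c = map(len, parts)
    return b == 2 * a and c == 3 * a

def pumpable(s, p, member):
    """Does some split s = uvxyz with |vxy| <= p and |vy| >= 1 keep
    u v^i x y^i z in the language for i = 0, 1, 2?  Checking a few i's
    is enough to expose a non-pumpable string."""
    n = len(s)
    for start in range(n):
        for vlen in range(0, p + 1):
            for xlen in range(0, p + 1 - vlen):
                for ylen in range(0, p + 1 - vlen - xlen):
                    if vlen + ylen == 0 or start + vlen + xlen + ylen > n:
                        continue
                    u = s[:start]
                    v = s[start:start + vlen]
                    x = s[start + vlen:start + vlen + xlen]
                    y = s[start + vlen + xlen:start + vlen + xlen + ylen]
                    z = s[start + vlen + xlen + ylen:]
                    if all(member(u + v * i + x + y * i + z) for i in (0, 1, 2)):
                        return True
    return False

p = 3
s = '0' * p + '#' + '0' * (2 * p) + '#' + '0' * (3 * p)
# No legal split pumps s, witnessing that this p fails the pumping lemma for B.
```

By contrast, any string of a genuinely context-free (indeed regular) language, such as 0*, pumps easily under the same checker.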
2.33 Show that F = {a^i b^j | i ≠ kj for every positive integer k} is not context free.

2.34 Consider the language B = L(G), where G is the grammar given in Exercise 2.13. The pumping lemma for context-free languages, Theorem 2.34, states the existence of a pumping length p for B. What is the minimum value of p that works in the pumping lemma? Justify your answer.

2.35 Let G be a CFG in Chomsky normal form that contains b variables. Show that, if G generates some string with a derivation having at least 2^b steps, L(G) is infinite.

2.36 Give an example of a language that is not context free but that acts like a CFL in the pumping lemma. Prove that your example works. (See the analogous example for regular languages in Problem 1.54.)

*2.37 Prove the following stronger form of the pumping lemma, wherein both pieces v and y must be nonempty when the string s is broken up.
If A is a context-free language, then there is a number k where, if s is any string in A of length at least k, then s may be divided into five pieces, s = uvxyz, satisfying the conditions:
a. for each i ≥ 0, u v^i x y^i z ∈ A,
b. v ≠ ε and y ≠ ε, and
c. |vxy| ≤ k.

ᴬ2.38 Refer to Problem 1.41 for the definition of the perfect shuffle operation. Show that the class of context-free languages is not closed under perfect shuffle.

2.39 Refer to Problem 1.42 for the definition of the shuffle operation. Show that the class of context-free languages is not closed under shuffle.

*2.40 Say that a language is prefix-closed if the prefix of any string in the language is also in the language. Let C be an infinite, prefix-closed, context-free language. Show that C contains an infinite regular subset.

*2.41 Read the definitions of NOPREFIX(A) and NOEXTEND(A) in Problem 1.40.
a. Show that the class of CFLs is not closed under the NOPREFIX operation.
b. Show that the class of CFLs is not closed under the NOEXTEND operation.

2.42 Let Σ = {1, #} and Y = {w | w = t_1#t_2# ... #t_k for k ≥ 0, each t_i ∈ 1*, and t_i ≠ t_j whenever i ≠ j}.
Prove that Y is not context free.

2.43 For strings w and t, write w ≗ t if the symbols of w are a permutation of the symbols of t. In other words, w ≗ t if t and w have the same symbols in the same quantities, but possibly in a different order. For any string w, define SCRAMBLE(w) = {t | t ≗ w}. For any language A, let SCRAMBLE(A) = {t | t ∈ SCRAMBLE(w) for some w ∈ A}.
a. Show that, if Σ = {0,1}, then the SCRAMBLE of a regular language is context free.
b. What happens in part (a) if Σ contains 3 or more symbols? Prove your answer.

2.44 If A and B are languages, define A ∘ B = {xy | x ∈ A and y ∈ B and |x| = |y|}. Show that if A and B are regular languages, then A ∘ B is a CFL.

*2.45 Let A = {w t w^R | w, t ∈ {0,1}* and |w| = |t|}. Prove that A is not a context-free language.

SELECTED SOLUTIONS

2.3 (a) R, X, S, T; (b) a, b; (c) R; (d) Three strings in L(G) are ab, ba, and aab; (e) Three strings not in L(G) are a, b, and ε; (f) False; (g) True; (h) False; (i) True; (j) True; (k) False; (l) True; (m) True; (n) False; (o) L(G) consists of all strings over a and b that are not palindromes.

2.4 (a) S → R1R1R1R
        R → 0R | 1R | ε
    (d) S → 0 | 0S0 | 0S1 | 1S0 | 1S1

2.6 (a) S → TaT
        T → TT | aTb | bTa | a | ε
        T generates all strings with at least as many a's as b's, and S forces an extra a.
    (c) S → TX
        T → 0T0 | 1T1 | #X
        X → 0X | 1X | ε

2.7 (a) The PDA uses its stack to count the number of a's minus the number of b's. It enters an accepting state whenever this count is positive. In more detail, it operates as follows. The PDA scans across the input. If it sees a b and its top stack symbol is an a, it pops the stack. Similarly, if it sees an a and its top stack symbol is a b, it pops the stack. In all other cases, it pushes the input symbol onto the stack. After the PDA scans the input, if an a is on top of the stack, it accepts. Otherwise it rejects.
(c) The PDA scans across the input string and pushes every symbol it reads until it reads a #.
If # is never encountered, it rejects. Then, the PDA skips over part of the input, nondeterministically deciding when to stop skipping. At that point, it compares the next input symbols with the symbols it pops off the stack. At any disagreement, or if the input finishes while the stack is nonempty, this branch of the computation rejects. If the stack becomes empty, the machine reads the rest of the input and accepts.

2.8 Here is one derivation:
⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⇒ ⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩
⇒ ⟨CMPLX-NOUN⟩⟨CMPLX-VERB⟩⟨PREP-PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨CMPLX-VERB⟩⟨PREP-PHRASE⟩
⇒ The boy ⟨VERB⟩⟨NOUN-PHRASE⟩⟨PREP-PHRASE⟩
⇒ The boy ⟨VERB⟩⟨NOUN-PHRASE⟩⟨PREP⟩⟨CMPLX-NOUN⟩
⇒ The boy touches ⟨NOUN-PHRASE⟩⟨PREP⟩⟨CMPLX-NOUN⟩
⇒ The boy touches ⟨CMPLX-NOUN⟩⟨PREP⟩⟨CMPLX-NOUN⟩
⇒ The boy touches ⟨ARTICLE⟩⟨NOUN⟩⟨PREP⟩⟨CMPLX-NOUN⟩
⇒ The boy touches the girl with ⟨CMPLX-NOUN⟩
⇒ The boy touches the girl with ⟨ARTICLE⟩⟨NOUN⟩
⇒ The boy touches the girl with the flower

Here is another derivation:
⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⇒ ⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB-PHRASE⟩
⇒ The boy ⟨VERB-PHRASE⟩
⇒ The boy ⟨CMPLX-VERB⟩
⇒ The boy ⟨VERB⟩⟨NOUN-PHRASE⟩
⇒ The boy touches ⟨NOUN-PHRASE⟩
⇒ The boy touches ⟨CMPLX-NOUN⟩⟨PREP-PHRASE⟩
⇒ The boy touches ⟨ARTICLE⟩⟨NOUN⟩⟨PREP-PHRASE⟩
⇒ The boy touches the girl ⟨PREP-PHRASE⟩
⇒ The boy touches the girl ⟨PREP⟩⟨CMPLX-NOUN⟩
⇒ The boy touches the girl with ⟨CMPLX-NOUN⟩
⇒ The boy touches the girl with ⟨ARTICLE⟩⟨NOUN⟩
⇒ The boy touches the girl with the flower

Each of these derivations corresponds to a different English meaning. In the first derivation, the sentence means that the boy used the flower to touch the girl. In the second derivation, the girl is holding the flower when the boy touches her.

2.18 (a) Let C be a context-free language and R be a regular language. Let P be the PDA that recognizes C, and D be the DFA that recognizes R.
If Q is the set of states of P and Q' is the set of states of D, we construct a PDA P' that recognizes C ∩ R with the set of states Q × Q'. P' will do what P does and also keep track of the states of D. It accepts a string w if and only if it stops at a state q ∈ F_P × F_D, where F_P is the set of accept states of P and F_D is the set of accept states of D. Since C ∩ R is recognized by P', it is context free.
(b) Let R be the regular language a*b*c*. If A were a CFL, then A ∩ R would be a CFL by part (a). However, A ∩ R = {a^n b^n c^n | n ≥ 0}, and Example 2.36 proves that A ∩ R is not context free. Thus A is not a CFL.

2.30 (b) Let B = {0^n # 0^2n # 0^3n | n ≥ 0}. Let p be the pumping length given by the pumping lemma. Let s = 0^p # 0^2p # 0^3p. We show that s = uvxyz cannot be pumped. Neither v nor y can contain #, otherwise u v^2 x y^2 z contains more than two #s. Therefore, if we divide s into three segments by the #s — 0^p, 0^2p, and 0^3p — at least one of the segments is not contained within either v or y. Hence u v^2 x y^2 z is not in B because the 1:2:3 length ratio of the segments is not maintained.
(c) Let C = {w#t | w is a substring of t, where w, t ∈ {a,b}*}. Let p be the pumping length given by the pumping lemma. Let s = a^p b^p # a^p b^p. We show that the string s = uvxyz cannot be pumped.
Neither v nor y can contain #, otherwise u v^0 x y^0 z does not contain # and therefore is not in C. If both v and y are nonempty and occur on the left-hand side of the #, the string u v^2 x y^2 z cannot be in C because it is longer on the left-hand side of the #. Similarly, if both strings occur on the right-hand side of the #, the string u v^0 x y^0 z cannot be in C because it is again longer on the left-hand side of the #. If only one of v and y is nonempty (they cannot both be empty), treat them as if both occurred on the same side of the # as above. The only remaining case is where both v and y are nonempty and straddle the #.
But then v consists of b's and y consists of a's because of the third pumping lemma condition, |vxy| ≤ p. Hence u v^2 x y^2 z contains more b's on the left-hand side of the #, so it cannot be a member of C.

2.38 Let A be the language {0^k 1^k | k ≥ 0} and let B be the language {a^k b^3k | k ≥ 0}. The perfect shuffle of A and B is the language C = {(0a)^k (0b)^k (1b)^2k | k ≥ 0}. Languages A and B are easily seen to be CFLs, but C is not a CFL, as follows. If C were a CFL, let p be the pumping length given by the pumping lemma, and let s be the string (0a)^p (0b)^p (1b)^2p. Because s is longer than p and s ∈ C, we can divide s = uvxyz satisfying the pumping lemma's three conditions. Strings in C contain twice as many 1s as a's. In order for u v^2 x y^2 z to have that property, the string vxy must contain both 1s and a's. But that is impossible, because they are separated by 2p symbols, yet the third condition says that |vxy| ≤ p. Hence C is not context free.

Now M scans the list of edges. For each edge, M tests whether the two underlined nodes n_1 and n_2 are the ones appearing in that edge. If they are, M dots n_1, removes the underlines, and goes on from the beginning of stage 2. If they aren't, M checks the next edge on the list. If there are no more edges, {n_1, n_2} is not an edge of G. Then M moves the underline on n_2 to the next dotted node and now calls this node n_2. It repeats the steps in this paragraph to check, as before, whether the new pair {n_1, n_2} is an edge. If there are no more dotted nodes, n_1 is not attached to any dotted nodes. Then M sets the underlines so that n_1 is the next undotted node and n_2 is the first dotted node and repeats the steps in this paragraph. If there are no more undotted nodes, M has not been able to find any new nodes to dot, so it moves on to stage 4.
For stage 4, M scans the list of nodes to determine whether all are dotted.
If they are, it enters the accept state; otherwise it enters the reject state. This completes the description of TM M.

EXERCISES

3.1 This exercise concerns TM M_2, whose description and state diagram appear in Example 3.7. In each of the parts, give the sequence of configurations that M_2 enters when started on the indicated input string.
a. 0.
ᴬb. 00.
c. 000.
d. 000000.

3.2 This exercise concerns TM M_1, whose description and state diagram appear in Example 3.9. In each of the parts, give the sequence of configurations that M_1 enters when started on the indicated input string.
ᴬa. 11.
b. 1#1.
c. 1##1.
d. 10#11.
e. 10#10.

ᴬ3.3 Modify the proof of Theorem 3.16 to obtain Corollary 3.19, showing that a language is decidable iff some nondeterministic Turing machine decides it. (You may assume the following theorem about trees. If every node in a tree has finitely many children and every branch of the tree has finitely many nodes, the tree itself has finitely many nodes.)

3.4 Give a formal definition of an enumerator. Consider it to be a type of two-tape Turing machine that uses its second tape as the printer. Include a definition of the enumerated language.

CHAPTER 3 / THE CHURCH-TURING THESIS

ᴬ3.5 Examine the formal definition of a Turing machine to answer the following questions, and explain your reasoning.
a. Can a Turing machine ever write the blank symbol ␣ on its tape?
b. Can the tape alphabet Γ be the same as the input alphabet Σ?
c. Can a Turing machine's head ever be in the same location in two successive steps?
d. Can a Turing machine contain just a single state?

3.6 In Theorem 3.21 we showed that a language is Turing-recognizable iff some enumerator enumerates it. Why didn't we use the following simpler algorithm for the forward direction of the proof? As before, s_1, s_2, ... is a list of all strings in Σ*.
E = "Ignore the input.
1.
Repeat the following for i = 1, 2, 3, ....
2. Run M on s_i.
3. If it accepts, print out s_i."

3.7 Explain why the following is not a description of a legitimate Turing machine.
M_bad = "The input is a polynomial p over variables x_1, ..., x_k.
1. Try all possible settings of x_1, ..., x_k to integer values.
2. Evaluate p on all of these settings.
3. If any of these settings evaluates to 0, accept; otherwise, reject."

3.8 Give implementation-level descriptions of Turing machines that decide the following languages over the alphabet {0,1}.
ᴬa. {w | w contains an equal number of 0s and 1s}
b. {w | w contains twice as many 0s as 1s}
c. {w | w does not contain twice as many 0s as 1s}

PROBLEMS

3.9 Let a k-PDA be a pushdown automaton that has k stacks. Thus a 0-PDA is an NFA and a 1-PDA is a conventional PDA. You already know that 1-PDAs are more powerful (recognize a larger class of languages) than 0-PDAs.
a. Show that 2-PDAs are more powerful than 1-PDAs.
b. Show that 3-PDAs are not more powerful than 2-PDAs. (Hint: Simulate a Turing machine tape with two stacks.)

ᴬ3.10 Say that a write-once Turing machine is a single-tape TM that can alter each tape square at most once (including the input portion of the tape). Show that this variant Turing machine model is equivalent to the ordinary Turing machine model. (Hint: As a first step, consider the case whereby the Turing machine may alter each tape square at most twice. Use lots of tape.)

3.11 A Turing machine with doubly infinite tape is similar to an ordinary Turing machine, but its tape is infinite to the left as well as to the right. The tape is initially filled with blanks except for the portion that contains the input. Computation is defined as usual except that the head never encounters an end to the tape as it moves leftward. Show that this type of Turing machine recognizes the class of Turing-recognizable languages.
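The hint to Problem 3.9(b), simulating a Turing machine tape with two stacks, can be sketched concretely. In this illustration (the class name and blank symbol are our own choices), one stack holds the cells to the left of the head and the other holds the head cell plus everything to its right, so every head move is a constant number of stack operations.

```python
class TwoStackTape:
    """A TM tape held as two stacks: `left` has the cells left of the
    head (top = nearest), `right` has the head cell and everything to
    its right (top = head cell).  Blanks extend the tape on demand."""
    BLANK = '_'

    def __init__(self, tape):
        self.left = []
        self.right = list(reversed(tape)) or [self.BLANK]

    def read(self):
        return self.right[-1] if self.right else self.BLANK

    def write(self, sym):
        if not self.right:
            self.right.append(self.BLANK)
        self.right[-1] = sym

    def move_right(self):
        # Current head cell slides onto the left stack.
        self.left.append(self.right.pop() if self.right else self.BLANK)

    def move_left(self):
        if self.left:                 # at the left end, stay put
            self.right.append(self.left.pop())
```

Because each operation touches only stack tops, a 2-PDA that carries out these moves on its two stacks can simulate any single-tape TM step for step, which is the heart of 3.9(b).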
3.12 A Turing machine with left reset is similar to an ordinary Turing machine, but the transition function has the form
δ: Q × Γ → Q × Γ × {R, RESET}.
If δ(q, a) = (r, b, RESET), when the machine is in state q reading an a, the machine's head jumps to the left-hand end of the tape after it writes b on the tape and enters state r. Note that these machines do not have the usual ability to move the head one symbol left. Show that Turing machines with left reset recognize the class of Turing-recognizable languages.

3.13 A Turing machine with stay put instead of left is similar to an ordinary Turing machine, but the transition function has the form
δ: Q × Γ → Q × Γ × {R, S}.
At each point the machine can move its head right or let it stay in the same position. Show that this Turing machine variant is not equivalent to the usual version. What class of languages do these machines recognize?

3.14 A queue automaton is like a push-down automaton except that the stack is replaced by a queue. A queue is a tape allowing symbols to be written only on the left-hand end and read only at the right-hand end. Each write operation (we'll call it a push) adds a symbol to the left-hand end of the queue, and each read operation (we'll call it a pull) reads and removes a symbol at the right-hand end. As with a PDA, the input is placed on a separate read-only input tape, and the head on the input tape can move only from left to right. The input tape contains a cell with a blank symbol following the input, so that the end of the input can be detected. A queue automaton accepts its input by entering a special accept state at any time. Show that a language can be recognized by a deterministic queue automaton iff the language is Turing-recognizable.

3.15 Show that the collection of decidable languages is closed under the operation of
ᴬa. union.
b. concatenation.
c. star.
d. complementation.
e. intersection.
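The closure construction for 3.15(a), union of decidable languages, is just sequential composition of two always-halting procedures. A minimal sketch, where the two sample "deciders" are hypothetical stand-ins for M_1 and M_2, not machines from the text:

```python
def union_decider(m1, m2):
    """Solution 3.15(a) in miniature: given two deciders (total
    predicates that always return True/False), build a decider for
    the union.  Running them one after the other is safe only
    because deciders always halt."""
    return lambda w: m1(w) or m2(w)

# Hypothetical deciders for two decidable languages:
ends_in_0 = lambda w: w.endswith('0')
even_len  = lambda w: len(w) % 2 == 0

in_union = union_decider(ends_in_0, even_len)
```

For merely Turing-recognizable languages this sequential strategy fails, since M_1 may loop; that is why Solution 3.16(a) instead runs the two machines alternately, step by step.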
3.16 Show that the collection of Turing-recognizable languages is closed under the operation of
ᴬa. union.
b. concatenation.
c. star.
d. intersection.

*3.17 Let B = {⟨M_1⟩, ⟨M_2⟩, ...} be a Turing-recognizable language consisting of TM descriptions. Show that there is a decidable language C consisting of TM descriptions such that every machine described in B has an equivalent machine in C and vice versa.

3.18 Show that a language is decidable iff some enumerator enumerates the language in lexicographic order.

3.19 Show that every infinite Turing-recognizable language has an infinite decidable subset.

3.20 Show that single-tape TMs that cannot write on the portion of the tape containing the input string recognize only regular languages.

3.21 Let c_1 x^n + c_2 x^(n-1) + ... + c_n x + c_(n+1) be a polynomial with a root at x = x_0. Let c_max be the largest absolute value of a c_i. Show that
|x_0| < (n + 1) c_max / |c_1|.

3.22 Let A be the language containing only the single string s, where
s = 0 if life never will be found on Mars,
s = 1 if life will be found on Mars someday.
Is A decidable? Why or why not? For the purposes of this problem, assume that the question of whether life will be found on Mars has an unambiguous YES or NO answer.

SELECTED SOLUTIONS

3.1 (b) q_1 00, ␣q_2 0, ␣x q_3 ␣, ␣q_5 x␣, q_5 ␣x␣, ␣q_2 x␣, ␣x q_2 ␣, ␣x␣ q_accept.

3.2 (a) q_1 11, x q_3 1, x1 q_3 ␣, x1␣ q_reject.

3.3 We prove both directions of the "iff." First, if a language L is decidable, it can be decided by a deterministic Turing machine, and that is automatically a nondeterministic Turing machine. Second, if a language L is decided by a nondeterministic TM N, we construct a deterministic TM D_2 that decides L. Machine D_2 runs the same algorithm that appears in the TM D described in the proof of Theorem 3.16, with an additional Stage 5: Reject if all branches of the nondeterminism of N are exhausted.
We argue that D_2 is a decider for L. If N accepts its input, D_2 will eventually find an accepting branch and accept, too. If N rejects its input, all of its branches halt and reject because it is a decider. Hence each of the branches has finitely many nodes, where each node represents one step of N's computation along that branch. Therefore N's entire computation tree on this input is finite, by virtue of the theorem about trees given in the statement of the exercise. Consequently D_2 will halt and reject when this entire tree has been explored.

3.5 (a) Yes. The tape alphabet Γ contains ␣. A Turing machine can write any characters in Γ on its tape.
(b) No. Σ never contains ␣, but Γ always contains ␣. So they cannot be equal.
(c) Yes. If the Turing machine attempts to move its head off the left-hand end of the tape, it remains on the same tape cell.
(d) No. Any Turing machine must contain two distinct states, q_accept and q_reject. So a Turing machine contains at least two states.

3.8 (a) "On input string w:
1. Scan the tape and mark the first 0 which has not been marked. If no unmarked 0 is found, go to stage 4. Otherwise, move the head back to the front of the tape.
2. Scan the tape and mark the first 1 which has not been marked. If no unmarked 1 is found, reject.
3. Move the head back to the front of the tape and go to stage 1.
4. Move the head back to the front of the tape. Scan the tape to see if any unmarked 1s remain. If none are found, accept; otherwise, reject."

3.10 We first simulate an ordinary Turing machine by a write-twice Turing machine. The write-twice machine simulates a single step of the original machine by copying the entire tape over to a fresh portion of the tape to the right-hand side of the currently used portion. The copying procedure operates character by character, marking a character as it is copied. This procedure alters each tape square twice, once to write the character for the first time and again to mark that it has been copied.
The position of the original Turing machine's tape head is marked on the tape. When copying the cells at, or adjacent to, the marked position, the tape contents are updated according to the rules of the original Turing machine. To carry out the simulation with a write-once machine, operate as before, except that each cell of the previous tape is now represented by two cells. The first of these contains the original machine's tape symbol and the second is for the mark used in the copying procedure. The input is not presented to the machine in the format with two cells per symbol, so the very first time the tape is copied, the copying marks are put directly over the input symbols.

3.15 (a) For any two decidable languages L_1 and L_2, let M_1 and M_2 be the TMs that decide them. We construct a TM M' that decides the union of L_1 and L_2:
"On input w:
1. Run M_1 on w. If it accepts, accept.
2. Run M_2 on w. If it accepts, accept. Otherwise, reject."
M' accepts w if either M_1 or M_2 accepts it. If both reject, M' rejects.

3.16 (a) For any two Turing-recognizable languages L_1 and L_2, let M_1 and M_2 be the TMs that recognize them. We construct a TM M' that recognizes the union of L_1 and L_2:
"On input w:
1. Run M_1 and M_2 alternately on w, step by step. If either accepts, accept. If both halt and reject, reject."
If either M_1 or M_2 accepts w, M' accepts w because the accepting TM arrives at its accepting state after a finite number of steps. Note that if both M_1 and M_2 reject and either of them does so by looping, then M' will loop.

3.22 The language A is one of the two languages {0} or {1}. In either case the language is finite, and hence decidable. If you aren't able to determine which of these two languages is A, you won't be able to describe the decider for A, but you can give two Turing machines, one of which is A's decider.

CHAPTER 4 / DECIDABILITY

machine M is a decider for A.
M = "On input w:
1. Run both M_1 and M_2 on input w in parallel.
2.
If M_1 accepts, accept; if M_2 accepts, reject."
Running the two machines in parallel means that M has two tapes, one for simulating M_1 and the other for simulating M_2. In this case M takes turns simulating one step of each machine, which continues until one of them accepts.
Now we show that M decides A. Every string w is either in A or in the complement of A. Therefore either M_1 or M_2 must accept w. Because M halts whenever M_1 or M_2 accepts, M always halts and so it is a decider. Furthermore, it accepts all strings in A and rejects all strings not in A. So M is a decider for A, and thus A is decidable.

COROLLARY 4.23
The complement of A_TM is not Turing-recognizable.

PROOF We know that A_TM is Turing-recognizable. If the complement of A_TM also were Turing-recognizable, A_TM would be decidable. Theorem 4.11 tells us that A_TM is not decidable, so the complement of A_TM must not be Turing-recognizable.

EXERCISES

ᴬ4.1 Answer all parts for the following DFA M and give reasons for your answers.
a. Is ⟨M, 0100⟩ ∈ A_DFA?
b. Is ⟨M, 011⟩ ∈ A_DFA?
c. Is ⟨M⟩ ∈ A_DFA?
d. Is ⟨M, 0100⟩ ∈ A_REX?
e. Is ⟨M⟩ ∈ E_DFA?
f. Is ⟨M, M⟩ ∈ EQ_DFA?

4.2 Consider the problem of determining whether a DFA and a regular expression are equivalent. Express this problem as a language and show that it is decidable.

4.3 Let ALL_DFA = {⟨A⟩ | A is a DFA and L(A) = Σ*}. Show that ALL_DFA is decidable.

4.4 Let A_εCFG = {⟨G⟩ | G is a CFG that generates ε}. Show that A_εCFG is decidable.

4.5 Let X be the set {1, 2, 3, 4, 5} and Y be the set {6, 7, 8, 9, 10}. We describe the functions f: X → Y and g: X → Y in the following tables. Answer each part and give a reason for each negative answer.

n  f(n)        n  g(n)
1   6          1   10
2   7          2    9
3   6          3    8
4   7          4    7
5   6          5    6

a. Is f one-to-one?
b. Is f onto?
c. Is f a correspondence?
d. Is g one-to-one?
e. Is g onto?
f. Is g a correspondence?

4.6 Let B be the set of all infinite sequences over {0,1}. Show that B is uncountable, using a proof by diagonalization.

4.7 Let T = {(i, j, k) | i, j, k ∈ N}.
Show that T is countable.

4.8 Review the way that we define sets to be the same size in Definition 4.12 (page 175). Show that "is the same size" is an equivalence relation.

PROBLEMS

ᴬ4.9 Let INFINITE_DFA = {⟨A⟩ | A is a DFA and L(A) is an infinite language}. Show that INFINITE_DFA is decidable.

4.10 Let INFINITE_PDA = {⟨M⟩ | M is a PDA and L(M) is an infinite language}. Show that INFINITE_PDA is decidable.

ᴬ4.11 Let A = {⟨M⟩ | M is a DFA which doesn't accept any string containing an odd number of 1s}. Show that A is decidable.

4.12 Let A = {⟨R, S⟩ | R and S are regular expressions and L(R) ⊆ L(S)}. Show that A is decidable.

ᴬ4.13 Let Σ = {0,1}. Show that the problem of determining whether a CFG generates some string in 1* is decidable. In other words, show that
{⟨G⟩ | G is a CFG over {0,1} and 1* ∩ L(G) ≠ ∅}
is a decidable language.

*4.14 Show that the problem of determining whether a CFG generates all strings in 1* is decidable. In other words, show that {⟨G⟩ | G is a CFG over {0,1} and 1* ⊆ L(G)} is a decidable language.

4.15 Let A = {⟨R⟩ | R is a regular expression describing a language containing at least one string w that has 111 as a substring (i.e., w = x111y for some x and y)}. Show that A is decidable.

4.16 Prove that EQ_DFA is decidable by testing the two DFAs on all strings up to a certain size. Calculate a size that works.

*4.17 Let C be a language. Prove that C is Turing-recognizable iff a decidable language D exists such that C = {x | ∃y (⟨x, y⟩ ∈ D)}.

4.18 Let A and B be two disjoint languages. Say that language C separates A and B if A ⊆ C and B ⊆ C̄ (the complement of C). Show that any two disjoint co-Turing-recognizable languages are separable by some decidable language.

4.19 Let S = {⟨M⟩ | M is a DFA that accepts w^R whenever it accepts w}. Show that S is decidable.

4.20 A language is prefix-free if no member is a proper prefix of another member. Let PREFIX-FREE_REX = {⟨R⟩ | R is a regular expression where L(R) is prefix-free}.
Show that PREFIX-FREE_REX is decidable. Why does a similar approach fail to show that PREFIX-FREE_CFG is decidable?

ᴬ*4.21 Say that an NFA is ambiguous if it accepts some string along two different computation branches. Let AMBIG_NFA = {⟨N⟩ | N is an ambiguous NFA}. Show that AMBIG_NFA is decidable. (Suggestion: One elegant way to solve this problem is to construct a suitable DFA and then run E_DFA on it.)

4.22 A useless state in a pushdown automaton is never entered on any input string. Consider the problem of determining whether a pushdown automaton has any useless states. Formulate this problem as a language and show that it is decidable.

ᴬ*4.23 Let BAL_DFA = {⟨M⟩ | M is a DFA that accepts some string containing an equal number of 0s and 1s}. Show that BAL_DFA is decidable. (Hint: Theorems about CFLs are helpful here.)

*4.24 Let PAL_DFA = {⟨M⟩ | M is a DFA that accepts some palindrome}. Show that PAL_DFA is decidable. (Hint: Theorems about CFLs are helpful here.)

*4.25 Let E = {⟨M⟩ | M is a DFA that accepts some string with more 1s than 0s}. Show that E is decidable. (Hint: Theorems about CFLs are helpful here.)

4.26 Let C = {⟨G, x⟩ | G is a CFG that generates some string w, where x is a substring of w}. Show that C is decidable. (Suggestion: An elegant solution to this problem uses the decider for E_CFG.)

4.27 Let C_CFG = {⟨G, k⟩ | G is a CFG and L(G) contains exactly k strings, where k ≥ 0 or k = ∞}. Show that C_CFG is decidable.

4.28 Let A be a Turing-recognizable language consisting of descriptions of Turing machines, {⟨M_1⟩, ⟨M_2⟩, ...}, where every M_i is a decider. Prove that some decidable language D is not decided by any decider M_i whose description appears in A. (Hint: You may find it helpful to consider an enumerator for A.)

SELECTED SOLUTIONS

4.1 (a) Yes. The DFA M accepts 0100.
(b) No. M doesn't accept 011.
(c) No. This input has only a single component and thus is not of the correct form.
(d) No.
The first component is not a regular expression and so the input is not of the correct form.
(e) No. M's language isn't empty.
(f) Yes. M accepts the same language as itself.

4.5 (a) No, f is not one-to-one because f(1) = f(3).
(d) Yes, g is one-to-one.

4.9 The following TM I decides INFINITE_DFA.
I = "On input ⟨A⟩, where A is a DFA:
1. Let k be the number of states of A.
2. Construct a DFA D that accepts all strings of length k or more.
3. Construct a DFA M such that L(M) = L(A) ∩ L(D).
4. Test whether L(M) = ∅, using the E_DFA decider T from Theorem 4.4.
5. If T accepts, reject; if T rejects, accept."
This algorithm works because a DFA that accepts infinitely many strings must accept arbitrarily long strings. Therefore this algorithm accepts such DFAs. Conversely, if the algorithm accepts a DFA, the DFA accepts some string of length k or more, where k is the number of states of the DFA. This string may be pumped in the manner of the pumping lemma for regular languages to obtain infinitely many accepted strings.

4.11 The following TM decides A.
"On input ⟨M⟩:
1. Construct a DFA O that accepts every string containing an odd number of 1s.
2. Construct a DFA B such that L(B) = L(M) ∩ L(O).
3. Test whether L(B) = ∅, using the E_DFA decider T from Theorem 4.4.
4. If T accepts, accept; if T rejects, reject."

4.13 You showed in Problem 2.18 that, if C is a context-free language and R is a regular language, then C ∩ R is context free. Therefore 1* ∩ L(G) is context free. The following TM decides A.
"On input ⟨G⟩:
1. Construct a CFG H such that L(H) = 1* ∩ L(G).
2. Test whether L(H) = ∅, using the E_CFG decider R from Theorem 4.8.
3. If R accepts, reject; if R rejects, accept."

4.21 The following procedure decides AMBIG_NFA. Given an NFA N, we design a DFA D that simulates N and accepts a string iff it is accepted by N along two different computational branches. Then we use a decider for E_DFA to determine whether D accepts any strings.
Our strategy for constructing D is similar to the NFA-to-DFA conversion in the proof of Theorem 1.39. We simulate N by keeping a pebble on each active state. We begin by putting a red pebble on the start state and on each state reachable from the start along ε transitions. We move, add, and remove pebbles in accordance with N's transitions, preserving the color of the pebbles. Whenever two or more pebbles are moved to the same state, we replace its pebbles with a blue pebble. After reading the input, we accept if a blue pebble is on an accept state of N.
The DFA D has a state corresponding to each possible position of pebbles. For each state of N, three possibilities occur: it can contain a red pebble, a blue pebble, or no pebble. Thus, if N has n states, D will have 3^n states. Its start state, accept states, and transition function are defined to carry out the simulation.

4.23 The language of all strings with an equal number of 0s and 1s is a context-free language, generated by the grammar S → 1S0S | 0S1S | ε. Let P be the PDA that recognizes this language. Build a TM M for BAL_DFA, which operates as follows. On input ⟨B⟩, where B is a DFA, use B and P to construct a new PDA R that recognizes the intersection of the languages of B and P. Then test whether R's language is empty. If its language is empty, reject; otherwise, accept.

EXERCISES

5.1 Show that EQ_CFG is undecidable.
5.2 Show that EQ_CFG is co-Turing-recognizable.
5.3 Find a match in the following instance of the Post Correspondence Problem.
5.4 If A ≤m B and B is a regular language, does that imply that A is a regular language? Why or why not?
A5.5 Show that A_TM is not mapping reducible to E_TM. In other words, show that no computable function reduces A_TM to E_TM. (Hint: Use a proof by contradiction, and facts you already know about A_TM and E_TM.)
A5.6 Show that ≤m is a transitive relation.
A5.7 Show that if A is Turing-recognizable and A ≤m Ā, then A is decidable.
A5.8 In the proof of Theorem 5.15 we modified the Turing machine M so that it never tries to move its head off the left-hand end of the tape. Suppose that we did not make this modification to M. Modify the PCP construction to handle this case.

PROBLEMS

5.9 Let T = {⟨M⟩ | M is a TM that accepts w^R whenever it accepts w}. Show that T is undecidable.
5.10 Consider the problem of determining whether a two-tape Turing machine ever writes a nonblank symbol on its second tape when it is run on input w. Formulate this problem as a language, and show that it is undecidable.
5.11 Consider the problem of determining whether a two-tape Turing machine ever writes a nonblank symbol on its second tape during the course of its computation on any input string. Formulate this problem as a language, and show that it is undecidable.
5.12 Consider the problem of determining whether a single-tape Turing machine ever writes a blank symbol over a nonblank symbol during the course of its computation on any input string. Formulate this problem as a language, and show that it is undecidable.
5.13 A useless state in a Turing machine is one that is never entered on any input string. Consider the problem of determining whether a Turing machine has any useless states. Formulate this problem as a language and show that it is undecidable.
5.14 Consider the problem of determining whether a Turing machine M on an input w ever attempts to move its head left when its head is on the left-most tape cell. Formulate this problem as a language and show that it is undecidable.
5.15 Consider the problem of determining whether a Turing machine M on an input w ever attempts to move its head left at any point during its computation on w. Formulate this problem as a language and show that it is decidable.
5.16 Let Γ = {0, 1, ⊔} be the tape alphabet for all TMs in this problem. Define the busy beaver function BB: N → N as follows.
For each value of k, consider all k-state TMs that halt when started with a blank tape. Let BB(k) be the maximum number of 1s that remain on the tape among all of these machines. Show that BB is not a computable function.
5.17 Show that the Post Correspondence Problem is decidable over the unary alphabet Σ = {1}.
5.18 Show that the Post Correspondence Problem is undecidable over the binary alphabet Σ = {0, 1}.
5.19 In the silly Post Correspondence Problem, SPCP, in each pair the top string has the same length as the bottom string. Show that the SPCP is decidable.
5.20 Prove that there exists an undecidable subset of {1}*.
5.21 Let AMBIG_CFG = {⟨G⟩ | G is an ambiguous CFG}. Show that AMBIG_CFG is undecidable. (Hint: Use a reduction from PCP. Given an instance of the Post Correspondence Problem, construct a CFG G with the rules
S → T | B
T → t1Ta1 | ⋯ | tkTak | t1a1 | ⋯ | tkak
B → b1Ba1 | ⋯ | bkBak | b1a1 | ⋯ | bkak,
where a1, ..., ak are new terminal symbols. Prove that this reduction works.)
5.22 Show that A is Turing-recognizable iff A ≤m A_TM.
5.23 Show that A is decidable iff A ≤m 0*1*.
5.24 Let J = {w | either w = 0x for some x ∈ A_TM, or w = 1y for some y ∈ Ā_TM}. Show that neither J nor J̄ is Turing-recognizable.
5.25 Give an example of an undecidable language B, where B ≤m B̄.
5.26 Define a two-headed finite automaton (2DFA) to be a deterministic finite automaton that has two read-only, bidirectional heads that start at the left-hand end of the input tape and can be independently controlled to move in either direction. The tape of a 2DFA is finite and is just large enough to contain the input plus two additional blank tape cells, one on the left-hand end and one on the right-hand end, that serve as delimiters. A 2DFA accepts its input by entering a special accept state. For example, a 2DFA can recognize the language {a^n b^n c^n | n ≥ 0}.
a. Let A_2DFA = {⟨M, x⟩ | M is a 2DFA and M accepts x}.
Show that A_2DFA is decidable.
b. Let E_2DFA = {⟨M⟩ | M is a 2DFA and L(M) = ∅}. Show that E_2DFA is not decidable.
5.27 A two-dimensional finite automaton (2DIM-DFA) is defined as follows. The input is an m × n rectangle, for any m, n ≥ 2. The squares along the boundary of the rectangle contain the symbol # and the internal squares contain symbols over the input alphabet Σ. The transition function is a mapping Q × Σ → Q × {L, R, U, D} to indicate the next state and the new head position (Left, Right, Up, Down). The machine accepts when it enters one of the designated accept states. It rejects if it tries to move off the input rectangle or if it never halts. Two such machines are equivalent if they accept the same rectangles. Consider the problem of determining whether two of these machines are equivalent. Formulate this problem as a language, and show that it is undecidable.
A*5.28 Rice's theorem. Let P be any nontrivial property of the language of a Turing machine. Prove that the problem of determining whether a given Turing machine's language has property P is undecidable.
In more formal terms, let P be a language consisting of Turing machine descriptions where P fulfills two conditions. First, P is nontrivial: it contains some, but not all, TM descriptions. Second, P is a property of the TM's language: whenever L(M1) = L(M2), we have ⟨M1⟩ ∈ P iff ⟨M2⟩ ∈ P. Here, M1 and M2 are any TMs. Prove that P is an undecidable language.
5.29 Show that both conditions in Problem 5.28 are necessary for proving that P is undecidable.
5.30 Use Rice's theorem, which appears in Problem 5.28, to prove the undecidability of each of the following languages.
A a. INFINITE_TM = {⟨M⟩ | M is a TM and L(M) is an infinite language}.
b. {⟨M⟩ | M is a TM and 1011 ∈ L(M)}.
c. ALL_TM = {⟨M⟩ | M is a TM and L(M) = Σ*}.
5.31 Let
f(x) = 3x + 1  for odd x
f(x) = x/2     for even x
for any natural number x.
If you start with an integer x and iterate f, you obtain a sequence, x, f(x), f(f(x)), ... Stop if you ever hit 1. For example, if x = 17, you get the sequence 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Extensive computer tests have shown that every starting point between 1 and a large positive integer gives a sequence that ends in 1. But the question of whether all positive starting points end up at 1 is unsolved; it is called the 3x + 1 problem.
Suppose that A_TM were decidable by a TM H. Use H to describe a TM that is guaranteed to state the answer to the 3x + 1 problem.
5.32 Prove that the following two languages are undecidable.
a. OVERLAP_CFG = {⟨G, H⟩ | G and H are CFGs where L(G) ∩ L(H) ≠ ∅}. (Hint: Adapt the hint in Problem 5.21.)
b. PREFIX-FREE_CFG = {⟨G⟩ | G is a CFG where L(G) is prefix-free}.
5.33 Let S = {⟨M⟩ | M is a TM and L(M) = {⟨M⟩}}. Show that neither S nor S̄ is Turing-recognizable.
5.34 Consider the problem of determining whether a PDA accepts some string of the form {ww | w ∈ {0,1}*}. Use the computation history method to show that this problem is undecidable.
5.35 Let X = {⟨M, w⟩ | M is a single-tape TM that never modifies the portion of the tape that contains the input w}. Is X decidable? Prove your answer.

SELECTED SOLUTIONS

5.5 Suppose for a contradiction that A_TM ≤m E_TM via reduction f. It follows from the definition of mapping reducibility that Ā_TM ≤m Ē_TM via the same reduction function f. However, Ē_TM is Turing-recognizable and Ā_TM is not Turing-recognizable, contradicting Theorem 5.28.
5.6 Suppose A ≤m B and B ≤m C. Then there are computable functions f and g such that x ∈ A ⟺ f(x) ∈ B and y ∈ B ⟺ g(y) ∈ C. Consider the composition function h(x) = g(f(x)). We can build a TM that computes h as follows: First, simulate a TM for f (such a TM exists because we assumed that f is computable) on input x and call the output y. Then simulate a TM for g on y. The output is h(x) = g(f(x)).
Therefore h is a computable function. Moreover, x ∈ A ⟺ h(x) ∈ C. Hence A ≤m C via the reduction function h.
5.7 Suppose that A ≤m Ā. Then Ā ≤m A via the same mapping reduction. Because A is Turing-recognizable, Theorem 5.28 implies that Ā is Turing-recognizable, and then Theorem 4.22 implies that A is decidable.
5.8 You need to handle the case where the head is at the leftmost tape cell and attempts to move left. To do so, add dominoes
[#qa / #rb]
for every q, r ∈ Q and a, b ∈ Γ, where δ(q, a) = (r, b, L).
5.10 Let B = {⟨M, w⟩ | M is a two-tape TM that writes a nonblank symbol on its second tape when it is run on w}. Show that A_TM reduces to B. Assume for the sake of contradiction that TM R decides B. Then construct TM S that uses R to decide A_TM.
S = "On input ⟨M, w⟩:
1. Use M to construct the following two-tape TM T.
T = "On input x:
1. Simulate M on x using the first tape.
2. If the simulation shows that M accepts, write a nonblank symbol on the second tape."
2. Run R on ⟨T, w⟩ to determine whether T on input w writes a nonblank symbol on its second tape.
3. If R accepts, M accepts w, therefore accept. Otherwise reject."
5.11 Let C = {⟨M⟩ | M is a two-tape TM that writes a nonblank symbol on its second tape when it is run on some input}. Show that A_TM reduces to C. Assume for the sake of contradiction that TM R decides C. Construct TM S that uses R to decide A_TM.
S = "On input ⟨M, w⟩:
1. Use M and w to construct the following two-tape TM T_w.
T_w = "On any input:
1. Simulate M on w using the first tape.
2. If the simulation shows that M accepts, write a nonblank symbol on the second tape."
2. Run R on ⟨T_w⟩ to determine whether T_w ever writes a nonblank symbol on its second tape.
3. If R accepts, M accepts w, therefore accept. Otherwise reject."
5.28 Assume for the sake of contradiction that P is a decidable language satisfying the properties and let R_P be a TM that decides P.
We show how to decide A_TM using R_P by constructing TM S. First let T_∅ be a TM that always rejects, so L(T_∅) = ∅. You may assume that ⟨T_∅⟩ ∉ P without loss of generality, because you could proceed with P̄ instead of P if ⟨T_∅⟩ ∈ P. Because P is not trivial, there exists a TM T with ⟨T⟩ ∈ P. Design S to decide A_TM using R_P's ability to distinguish between T_∅ and T.
S = "On input ⟨M, w⟩:
1. Use M and w to construct the following TM M_w.
M_w = "On input x:
1. Simulate M on w. If it halts and rejects, reject. If it accepts, proceed to stage 2.
2. Simulate T on x. If it accepts, accept."
2. Use TM R_P to determine whether ⟨M_w⟩ ∈ P. If YES, accept. If NO, reject."
TM M_w simulates T if M accepts w. Hence L(M_w) equals L(T) if M accepts w and ∅ otherwise. Therefore ⟨M_w⟩ ∈ P iff M accepts w.
5.30 (a) INFINITE_TM is a language of TM descriptions. It satisfies the two conditions of Rice's theorem. First, it is nontrivial because some TMs have infinite languages and others do not. Second, it depends only on the language. If two TMs recognize the same language, either both have descriptions in INFINITE_TM or neither do. Consequently, Rice's theorem implies that INFINITE_TM is undecidable.

CHAPTER 6 / ADVANCED TOPICS IN COMPUTABILITY THEORY

EXERCISES

6.1 Give an example in the spirit of the recursion theorem of a program in a real programming language (or a reasonable approximation thereof) that prints itself out.
6.2 Show that any infinite subset of MIN_TM is not Turing-recognizable.
A6.3 Show that if A ≤T B and B ≤T C, then A ≤T C.
6.4 Let A_TM′ = {⟨M, w⟩ | M is an oracle TM and M^A_TM accepts w}. Show that A_TM′ is undecidable relative to A_TM.
A6.5 Is the statement ∃x ∀y [x + y = y] a member of Th(N, +)? Why or why not? What about the statement ∃x ∀y [x + y = x]?

PROBLEMS

6.6 Describe two different Turing machines, M and N, that, when started on any input, M outputs ⟨N⟩ and N outputs ⟨M⟩.
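Exercise 6.1 above asks for a real program that prints itself out. One well-known Python pattern (a sketch, not the only possible construction; ignoring the comment lines, the program's output is exactly its own two lines of code):

```python
# A self-reproducing program ("quine") in the spirit of Exercise 6.1.
# The string s plays the role of the program's own description;
# %r re-quotes s when it is substituted into itself.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

The structure mirrors the proof of the recursion theorem: the assignment to s is the "B part" stored as data, and the print statement is the "A part" that applies the description to itself.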
6.7 In the fixed-point version of the recursion theorem (Theorem 6.8), let the transformation t be a function that interchanges the states q_accept and q_reject in Turing machine descriptions. Give an example of a fixed point for t.
6.8 Show that ...
A6.9 Use the recursion theorem to give an alternative proof of Rice's theorem in Problem 5.28.
A6.10 Give a model of the sentence
φ_eq = ∀x [R1(x, x)]
     ∧ ∀x,y [R1(x, y) ↔ R1(y, x)]
     ∧ ∀x,y,z [(R1(x, y) ∧ R1(y, z)) → R1(x, z)].
*6.11 Let φ_eq be defined as in Problem 6.10. Give a model of the sentence
φ_lt = φ_eq
     ∧ ∀x,y [R1(x, y) → ¬R2(x, y)]
     ∧ ∀x,y [¬R1(x, y) → (R2(x, y) ⊕ R2(y, x))]
     ∧ ∀x,y,z [(R2(x, y) ∧ R2(y, z)) → R2(x, z)]
     ∧ ∀x ∃y [R2(x, y)].
A6.12 Let (N, <) be the model with universe N and the "less than" relation. Show that Th(N, <) is decidable.
6.13 For each m > 1, let Z_m = {0, 1, 2, ..., m − 1}, and let T_m = (Z_m, +, ×) be the model whose universe is Z_m and that has relations corresponding to the + and × relations computed modulo m. Show that for each m the theory Th(T_m) is decidable.
6.14 Show that for any two languages A and B, a language J exists, where A ≤T J and B ≤T J.
6.15 Show that for any language A, a language B exists, where A ≤T B and B ≰T A.
*6.16 Prove that there exist two languages A and B that are Turing-incomparable; that is, where A ≰T B and B ≰T A.
*6.17 Let A and B be two disjoint languages. Say that language C separates A and B if A ⊆ C and B ⊆ C̄. Describe two disjoint Turing-recognizable languages that aren't separable by any decidable language.
6.18 In Corollary 4.18 we showed that the set of all languages is uncountable. Use this result to prove that languages exist that are not recognizable by an oracle Turing machine with oracle for A_TM.
6.19 Recall the Post Correspondence Problem that we defined in Section 5.2 and its associated language PCP.
Show that PCP is decidable relative to A_TM.
6.20 Show how to compute the descriptive complexity of strings, K(x), with an oracle for A_TM.
6.21 Use the result of Problem 6.20 to give a function f that is computable with an oracle for A_TM, where for each n, f(n) is an incompressible string of length n.
6.22 Show that the function K(x) is not a computable function.
6.23 Show that the set of incompressible strings is undecidable.
6.24 Show that the set of incompressible strings contains no infinite subset that is Turing-recognizable.
*6.25 Show that for any c, some strings x and y exist, where K(xy) > K(x) + K(y) + c.

SELECTED SOLUTIONS

6.3 Say that M1 decides A relative to B, and M2 decides B relative to C. Use an oracle TM M3, where M3 decides A relative to C. Machine M3 simulates M1. Every time M1 queries its oracle about some string x, machine M3 tests whether x ∈ B and provides the answer to M1. Because machine M3 doesn't have an oracle for B and cannot perform that test directly, it simulates M2 on input x to obtain that information. Machine M3 can obtain the answer to M2's queries directly because these two machines use the same oracle, C.
6.5 The statement ∃x ∀y [x + y = y] is a member of Th(N, +) because that statement is true for the standard interpretation of + over the universe N. Recall that we use N = {0, 1, 2, ...} in this chapter, and so we may use x = 0. The statement ∃x ∀y [x + y = x] is not a member of Th(N, +) because that statement isn't true in this model. For any value of x, setting y = 1 causes x + y = x to fail.

6.9 Assume for the sake of contradiction that some TM X decides a property P, and P satisfies the conditions of Rice's theorem. One of these conditions says that TMs A and B exist where ⟨A⟩ ∈ P and ⟨B⟩ ∉ P. Use A and B to construct TM R:
R = "On input w:
1. Obtain own description ⟨R⟩ using the recursion theorem.
2. Run X on ⟨R⟩.
3. If X accepts ⟨R⟩, simulate B on w.
If X rejects ⟨R⟩, simulate A on w."
If ⟨R⟩ ∈ P, then X accepts ⟨R⟩ and L(R) = L(B). But ⟨B⟩ ∉ P, contradicting ⟨R⟩ ∈ P, because P agrees on TMs that have the same language. We arrive at a similar contradiction if ⟨R⟩ ∉ P. Therefore our original assumption is false. Every property satisfying the conditions of Rice's theorem is undecidable.
6.10 The statement φ_eq gives the three conditions of an equivalence relation. A model (A, R1), where A is any universe and R1 is any equivalence relation over A, is a model of φ_eq. For example, let A be the integers Z and let R1 = {(i, i) | i ∈ Z}.
6.12 Reduce Th(N, <) to Th(N, +), which we've already shown to be decidable. To do so, show how to convert a sentence φ1 over the language of Th(N, <) to a sentence φ2 over the language of Th(N, +) while preserving truth or falsity in the respective models. Replace every occurrence of i < j in φ1 by the formula ∃k [(i + k = j) ∧ (k + k ≠ k)] in φ2, where k is a different new variable each time. Sentence φ2 is equivalent to φ1 because "i is less than j" means that we can add a nonzero value to i and obtain j. Putting φ2 into prenex-normal form, as required by the algorithm for deciding Th(N, +), requires a bit of additional work. The new existential quantifiers are brought to the front of the sentence. To do so, these quantifiers must pass through Boolean operations that appear in the sentence. Quantifiers can be brought through the operations of ∧ and ∨ without change. Passing through ¬ changes ∃ to ∀ and vice versa. Thus ¬∃k ψ becomes the equivalent expression ∀k ¬ψ, and ¬∀k ψ becomes ∃k ¬ψ.

CHAPTER 7 / TIME COMPLEXITY

y_i or z_i for each i, but not both. Now we make the satisfying assignment. If the subset contains y_i, we assign x_i TRUE; otherwise, we assign it FALSE. This assignment must satisfy φ because in each of the final k columns the sum is always 3.
In column c_j, at most 2 can come from g_j and h_j, so at least 1 in this column must come from some y_i or z_i in the subset. If it is y_i, then x_i appears in c_j and is assigned TRUE, so c_j is satisfied. If it is z_i, then x̄_i appears in c_j and x_i is assigned FALSE, so c_j is satisfied. Therefore φ is satisfied. Finally, we must be sure that the reduction can be carried out in polynomial time. The table has a size of roughly (k + l)², and each entry can be easily calculated for any φ. So the total time is O(n²) easy stages.

EXERCISES

7.1 Answer each part TRUE or FALSE.
a. 2n = O(n).
b. n² = O(n).
A c. n² = O(n log² n).
A d. n log n = O(n²).
e. 3ⁿ = 2^O(n).
f. 2^(2ⁿ) = O(2^(2ⁿ)).
7.2 Answer each part TRUE or FALSE.
a. n = o(2n).
b. 2n = o(n²).
A c. 2ⁿ = o(3ⁿ).
A d. 1 = o(n).
e. n = o(log n).
f. 1 = o(1/n).
7.3 Which of the following pairs of numbers are relatively prime? Show the calculations that led to your conclusions.
a. 1274 and 10505
b. 7289 and 8029
7.4 Fill out the table described in the polynomial time algorithm for context-free language recognition from Theorem 7.16 for string w = baba and CFG G:
S → RT
R → TR | a
T → TR | b
7.5 Is the following formula satisfiable?
(x ∨ y) ∧ (x ∨ ¬y) ∧ (¬x ∨ y) ∧ (¬x ∨ ¬y)
7.6 Show that P is closed under union, concatenation, and complement.
7.7 Show that NP is closed under union and concatenation.
7.8 Let CONNECTED = {⟨G⟩ | G is a connected undirected graph}. Analyze the algorithm given on page 157 to show that this language is in P.
7.9 A triangle in an undirected graph is a 3-clique. Show that TRIANGLE ∈ P, where TRIANGLE = {⟨G⟩ | G contains a triangle}.
7.10 Show that ALL_DFA is in P.
7.11 Call graphs G and H isomorphic if the nodes of G may be reordered so that it is identical to H. Let ISO = {⟨G, H⟩ | G and H are isomorphic graphs}. Show that ISO ∈ NP.

PROBLEMS

7.12 Let MODEXP = {⟨a, b, c, p⟩ | a, b, c, and p are binary integers such that a^b ≡ c (mod p)}. Show that MODEXP ∈ P.
(Note that the most obvious algorithm doesn't run in polynomial time. Hint: Try it first where b is a power of 2.)
7.13 A permutation on the set {1, ..., k} is a one-to-one, onto function on this set. When p is a permutation, p^t means the composition of p with itself t times. Let PERM-POWER = {⟨p, q, t⟩ | p^t = q where p and q are permutations on {1, ..., k} and t is a binary integer}. Show that PERM-POWER ∈ P. (Note that the most obvious algorithm doesn't run within polynomial time. Hint: First try it where t is a power of 2.)
7.14 Show that P is closed under the star operation. (Hint: Use dynamic programming. On input y = y1 ⋯ yn for yi ∈ Σ, build a table indicating for each i ≤ j whether the substring yi ⋯ yj ∈ A* for any A ∈ P.)
A7.15 Show that NP is closed under the star operation.
7.16 Let UNARY-SSUM be the subset sum problem in which all numbers are represented in unary. Why does the NP-completeness proof for SUBSET-SUM fail to show UNARY-SSUM is NP-complete? Show that UNARY-SSUM ∈ P.
7.17 Show that, if P = NP, then every language A ∈ P, except A = ∅ and A = Σ*, is NP-complete.
*7.18 Show that PRIMES = {m | m is a prime number in binary} ∈ NP. (Hint: For p > 1, the multiplicative group Z_p* = {x | x is relatively prime to p and 1 ≤ x < p} is both cyclic and of order p − 1 iff p is prime. You may use this fact without justifying it. The stronger statement PRIMES ∈ P is now known to be true, but it is more difficult to prove.)
7.19 We generally believe that PATH is not NP-complete. Explain the reason behind this belief. Show that proving PATH is not NP-complete would prove P ≠ NP.
7.20 Let G represent an undirected graph. Also let
SPATH = {⟨G, a, b, k⟩ | G contains a simple path of length at most k from a to b}, and
LPATH = {⟨G, a, b, k⟩ | G contains a simple path of length at least k from a to b}.
a. Show that SPATH ∈ P.
b. Show that LPATH is NP-complete.
You may assume the NP-completeness of UHAMPATH, the Hamiltonian path problem for undirected graphs.
7.21 Let DOUBLE-SAT = {⟨φ⟩ | φ has at least two satisfying assignments}. Show that DOUBLE-SAT is NP-complete.
A7.22 Let HALF-CLIQUE = {⟨G⟩ | G is an undirected graph having a complete subgraph with at least m/2 nodes, where m is the number of nodes in G}. Show that HALF-CLIQUE is NP-complete.
7.23 Let CNF_k = {⟨φ⟩ | φ is a satisfiable cnf-formula where each variable appears in at most k places}.
a. Show that CNF_2 ∈ P.
b. Show that CNF_3 is NP-complete.
7.24 Let φ be a 3cnf-formula. A ≠-assignment to the variables of φ is one where each clause contains two literals with unequal truth values. In other words, a ≠-assignment satisfies φ without assigning three true literals in any clause.
a. Show that the negation of any ≠-assignment to φ is also a ≠-assignment.
b. Let ≠SAT be the collection of 3cnf-formulas that have a ≠-assignment. Show that we obtain a polynomial time reduction from 3SAT to ≠SAT by replacing each clause c_i
(y1 ∨ y2 ∨ y3)
with the two clauses
(y1 ∨ y2 ∨ z_i) and (¬z_i ∨ y3 ∨ b),
where z_i is a new variable for each clause c_i and b is a single additional new variable.
c. Conclude that ≠SAT is NP-complete.
7.25 A cut in an undirected graph is a separation of the vertices V into two disjoint subsets S and T. The size of a cut is the number of edges that have one endpoint in S and the other in T. Let MAX-CUT = {⟨G, k⟩ | G has a cut of size k or more}. Show that MAX-CUT is NP-complete. You may assume the result of Problem 7.24. (Hint: Show that ≠SAT ≤P MAX-CUT. The variable gadget for variable x is a collection of 3c nodes labeled with x and another 3c nodes labeled with x̄, where c is the number of clauses. All nodes labeled x are connected with all nodes labeled x̄. The clause gadget is a triangle of three edges connecting three nodes labeled with the literals appearing in the clause.
Do not use the same node in more than one clause gadget. Prove that this reduction works.)
7.26 You are given a box and a collection of cards as indicated in the following figure. Because of the pegs in the box and the notches in the cards, each card will fit in the box in either of two ways. Each card contains two columns of holes, some of which may not be punched out. The puzzle is solved by placing all the cards in the box so as to completely cover the bottom of the box (i.e., every hole position is blocked by at least one card that has no hole there). Let PUZZLE = {⟨c1, ..., ck⟩ | each c_i represents a card and this collection of cards has a solution}. Show that PUZZLE is NP-complete.
[Figure: a box with pegs, and a card shown inserted both ways; notches and holes are labeled.]
7.27 A coloring of a graph is an assignment of colors to its nodes so that no two adjacent nodes are assigned the same color. Let 3COLOR = {⟨G⟩ | the nodes of G can be colored with three colors such that no two nodes joined by an edge have the same color}. Show that 3COLOR is NP-complete. (Hint: Use the following three subgraphs.)
[Figure: the palette, variable, and OR-gadget subgraphs.]
7.28 Let SET-SPLITTING = {⟨S, C⟩ | S is a finite set and C = {C1, ..., Ck} is a collection of subsets of S, for some k > 0, such that elements of S can be colored red or blue so that no C_i has all its elements colored with the same color}. Show that SET-SPLITTING is NP-complete.
7.29 Consider the following scheduling problem. You are given a list of final exams F1, ..., Fk to be scheduled, and a list of students S1, ..., Sl. Each student is taking some specified subset of these exams. You must schedule these exams into slots so that no student is required to take two exams in the same slot. The problem is to determine if such a schedule exists that uses only h slots. Formulate this problem as a language and show that this language is NP-complete.
7.30 This problem is inspired by the single-player game Minesweeper, generalized to an arbitrary graph.
Let G be an undirected graph, where each node either contains a single, hidden mine or is empty. The player chooses nodes, one by one. If the player chooses a node containing a mine, the player loses. If the player chooses an empty node, the player learns the number of neighboring nodes containing mines. (A neighboring node is one connected to the chosen node by an edge.) The player wins if and when all empty nodes have been so chosen.
In the mine consistency problem you are given a graph G, along with numbers labeling some of G's nodes. You must determine whether a placement of mines on the remaining nodes is possible, so that any node v that is labeled m has exactly m neighboring nodes containing mines. Formulate this problem as a language and show that it is NP-complete.
7.31 In the following solitaire game, you are given an m × m board. On each of its m² positions lies either a blue stone, a red stone, or nothing at all. You play by removing stones from the board so that each column contains only stones of a single color and each row contains at least one stone. You win if you achieve this objective. Winning may or may not be possible, depending upon the initial configuration. Let SOLITAIRE = {⟨G⟩ | G is a winnable game configuration}. Prove that SOLITAIRE is NP-complete.
7.32 Let U = {⟨M, x, #^t⟩ | TM M accepts input x within t steps on at least one branch}. Show that U is NP-complete.
7.33 Recall, in our discussion of the Church-Turing thesis, that we introduced the language D = {⟨p⟩ | p is a polynomial in several variables having an integral root}. We stated, but didn't prove, that D is undecidable. In this problem you are to prove a different property of D; namely, that D is NP-hard. A problem is NP-hard if all problems in NP are polynomial time reducible to it, even though it may not be in NP itself. So, you must show that all problems in NP are polynomial time reducible to D.
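The hint shared by Problems 7.12 and 7.13 above is repeated squaring: a^b mod p is computable with O(log b) multiplications, even though b itself may be exponentially large in the length of its binary representation. A minimal sketch (for illustration only; Python's built-in pow(a, b, p) performs the same computation):

```python
def mod_exp(a, b, p):
    """Compute a**b mod p by squaring along the binary expansion of b."""
    result = 1
    a %= p
    while b > 0:
        if b & 1:                    # current low-order bit of b is 1
            result = (result * a) % p
        a = (a * a) % p              # square: a, a^2, a^4, a^8, ...
        b >>= 1                      # shift to the next bit of b
    return result

print(mod_exp(3, 1000, 101))  # agrees with pow(3, 1000, 101)
```

The loop runs once per bit of b, and every intermediate value is reduced mod p, so the running time is polynomial in the input length. The same squaring idea, applied to composition of permutations, handles PERM-POWER.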
7.34 A subset of the nodes of a graph G is a dominating set if every other node of G is adjacent to some node in the subset. Let DOMINATING-SET = {⟨G, k⟩ | G has a dominating set with k nodes}. Show that it is NP-complete by giving a reduction from VERTEX-COVER.
7.35 Show that the following problem is NP-complete. You are given a set of states Q = {q0, q1, ..., ql} and a collection of pairs {(s1, r1), ..., (sk, rk)}, where the s_i are distinct strings over Σ = {0, 1}, and the r_i are (not necessarily distinct) members of Q. Determine whether a DFA M = (Q, Σ, δ, q0, F) exists where δ(q0, s_i) = r_i for each i. Here, δ(q, s) is the state that M enters after reading s, starting at state q. (Note that F is irrelevant here.)
7.36 Show that if P = NP, a polynomial time algorithm exists that produces a satisfying assignment when given a satisfiable Boolean formula. (Note: The algorithm you are asked to provide computes a function, but NP contains languages, not functions. The P = NP assumption implies that SAT is in P, so testing satisfiability is solvable in polynomial time. But the assumption doesn't say how this test is done, and the test may not reveal satisfying assignments. You must show that you can find them anyway. Hint: Use the satisfiability tester repeatedly to find the assignment bit-by-bit.)
*7.37 Show that if P = NP, you can factor integers in polynomial time. (See the note in Problem 7.36.)
7.38 Show that if P = NP, a polynomial time algorithm exists that takes an undirected graph as input and finds a largest clique contained in that graph. (See the note in Problem 7.36.)
7.39 In the proof of the Cook-Levin theorem, a window is a 2 × 3 rectangle of cells. Show why the proof would have failed if we had used 2 × 2 windows instead.
*7.40 Consider the algorithm MINIMIZE, which takes a DFA M as input and outputs DFA M′.
MINIMIZE = "On input ⟨M⟩, where M = (Q, Σ, δ, q0, A) is a DFA:
1.
Remove all states of M that are unreachable from the start state.
2. Construct the following undirected graph G whose nodes are the states of M.
3. Place an edge in G connecting every accept state with every nonaccept state. Add additional edges as follows.
4. Repeat until no new edges are added to G:
5. For every pair of distinct states q and r of M and every a ∈ Σ:
6. Add the edge (q, r) to G if (δ(q, a), δ(r, a)) is an edge of G.
7. For each state q, let [q] be the collection of states [q] = {r ∈ Q | no edge joins q and r in G}.
8. Form a new DFA M′ = (Q′, Σ, δ′, q0′, A′), where
Q′ = {[q] | q ∈ Q} (if [q] = [r], only one of them is in Q′),
δ′([q], a) = [δ(q, a)] for every q ∈ Q and a ∈ Σ,
q0′ = [q0], and
A′ = {[q] | q ∈ A}.
9. Output ⟨M′⟩."
a. Show that M and M′ are equivalent.
b. Show that M′ is minimal; that is, no DFA with fewer states recognizes the same language. You may use the result of Problem 1.52 without proof.
c. Show that MINIMIZE operates in polynomial time.
7.41 For a cnf-formula φ with m variables and c clauses, show that you can construct in polynomial time an NFA with O(cm) states that accepts all nonsatisfying assignments, represented as Boolean strings of length m. Conclude that NFAs cannot be minimized in polynomial time unless P = NP.
*7.42 A 2cnf-formula is an AND of clauses, where each clause is an OR of at most two literals. Let 2SAT = {⟨φ⟩ | φ is a satisfiable 2cnf-formula}. Show that 2SAT ∈ P.
7.43 Modify the algorithm for context-free language recognition in the proof of Theorem 7.16 to give a polynomial time algorithm that produces a parse tree for a string, given the string and a CFG, if that grammar generates the string.
7.44 Say that two Boolean formulas are equivalent if they have the same set of variables and are true on the same set of assignments to those variables (i.e., they describe the same Boolean function). A Boolean formula is minimal if no shorter Boolean formula is equivalent to it.
Let MIN-FORMULA be the collection of minimal Boolean formulas. Show that, if P = NP, then MIN-FORMULA ∈ P.

7.45 The difference hierarchy DiP is defined recursively as
a. D1P = NP and
b. DiP = {A | A = B \ C for B in NP and C in Di−1P}.
(Here B \ C = B ∩ C̄.)
For example, a language in D2P is the difference of two NP languages. Sometimes D2P is called DP (and may be written D^p). Let
Z = {⟨G1, k1, G2, k2⟩ | G1 has a k1-clique and G2 doesn't have a k2-clique}.
Show that Z is complete for DP. In other words, show that every language in DP is polynomial time reducible to Z.

7.46 Let MAX-CLIQUE = {⟨G, k⟩ | the largest clique in G is of size exactly k}. Use the result of Problem 7.45 to show that MAX-CLIQUE is DP-complete.

7.47 Let f: N → N be any function where f(n) = o(n log n). Show that TIME(f(n)) contains only the regular languages.

7.48 Call a regular expression star-free if it does not contain any star operations. Let EQ_SF-REX = {⟨R, S⟩ | R and S are equivalent star-free regular expressions}. Show that EQ_SF-REX is in coNP. Why does your argument fail for general regular expressions?

7.49 This problem investigates resolution, a method for proving the unsatisfiability of cnf-formulas. Let φ = C1 ∧ C2 ∧ ⋯ ∧ Cm be a formula in cnf, where the Ci are its clauses. Let C = {Ci | Ci is a clause of φ}. In a resolution step, we take two clauses Ca and Cb in C which both have some variable x, occurring positively in one of the clauses and negatively in the other. Thus Ca = (x ∨ y1 ∨ y2 ∨ ⋯ ∨ yk) and Cb = (x̄ ∨ z1 ∨ z2 ∨ ⋯ ∨ zl), where the yi and zi are literals. We form the new clause (y1 ∨ y2 ∨ ⋯ ∨ yk ∨ z1 ∨ z2 ∨ ⋯ ∨ zl) and remove repeated literals. Add this new clause to C. Repeat the resolution steps until no additional clauses can be obtained. If the empty clause ( ) is in C, then declare φ unsatisfiable.

Say that resolution is sound if it never declares satisfiable formulas to be unsatisfiable.
Say that resolution is complete if all unsatisfiable formulas are declared to be unsatisfiable.
a. Show that resolution is sound and complete.
b. Use part (a) to show that 2SAT ∈ P.

SELECTED SOLUTIONS

7.1 (c) FALSE; (d) TRUE.

7.2 (c) TRUE; (d) TRUE.

7.15 Let A ∈ NP. Construct NTM M to decide A in nondeterministic polynomial time.
M = "On input w:
1. Nondeterministically divide w into pieces w = x1 x2 ⋯ xk.
2. For each xi, nondeterministically guess the certificates that show xi ∈ A.
3. Verify all certificates if possible, then accept. Otherwise, if verification fails, reject."

7.22 We give a polynomial time mapping reduction from CLIQUE to HALF-CLIQUE. The input to the reduction is a pair ⟨G, k⟩, and the reduction produces the graph ⟨H⟩ as output, where H is as follows. If G has m nodes and k = m/2, then H = G. If k < m/2, then H is the graph obtained from G by adding j nodes, each connected to every one of the original nodes and to each other, where j = m − 2k. Thus H has m + j = 2m − 2k nodes. Observe that G has a k-clique iff H has a clique of size k + j = m − k, and so ⟨G, k⟩ ∈ CLIQUE iff ⟨H⟩ ∈ HALF-CLIQUE. If k > m/2, then H is the graph obtained by adding j nodes to G without any additional edges, where j = 2k − m. Thus H has m + j = 2k nodes, and so G has a k-clique iff H has a clique of size k. Therefore ⟨G, k⟩ ∈ CLIQUE iff ⟨H⟩ ∈ HALF-CLIQUE. We also need to show HALF-CLIQUE ∈ NP. The certificate is simply the clique.

7.31 First, SOLITAIRE ∈ NP because we can verify that a solution works in polynomial time. Second, we show that 3SAT ≤P SOLITAIRE. Given φ with m variables x1, ..., xm and k clauses c1, ..., ck, construct the following k × m game G. We assume that φ has no clauses that contain both xi and x̄i, because such clauses may be removed without affecting satisfiability. If xi is in clause cj, put a blue stone in row cj, column xi. If x̄i is in clause cj, put a red stone in row cj, column xi.
We can make the board square by repeating a row or adding a blank column as necessary without affecting solvability. We show that φ is satisfiable iff G has a solution.

(→) Take a satisfying assignment. If xi is true (false), remove the red (blue) stones from the corresponding column. So, stones corresponding to true literals remain. Because every clause has a true literal, every row has a stone.

(←) Take a game solution. If the red (blue) stones were removed from a column, set the corresponding variable true (false). Every row has a stone remaining, so every clause has a true literal. Therefore φ is satisfied.

7.38 If you assume that P = NP, then CLIQUE ∈ P, and you can test whether G contains a clique of size k in polynomial time, for any value of k. By testing whether G contains a clique of each size, from 1 to the number of nodes in G, you can determine the size t of a maximum clique in G in polynomial time. Once you know t, you can find a clique with t nodes as follows. For each node x of G, remove x and calculate the resulting maximum clique size. If the resulting size decreases, replace x and continue with the next node. If the resulting size is still t, keep x permanently removed and continue with the next node. When you have considered all nodes in this way, the remaining nodes are a t-clique.

EXERCISES

8.1 Show that for any function f: N → N, where f(n) ≥ n, the space complexity class SPACE(f(n)) is the same whether you define the class by using the single-tape TM model or the two-tape read-only input TM model.

8.2 Consider the following position in the standard tic-tac-toe game.

[board diagram omitted from this copy: a partially played position with markers X, O, O, X]

Let's say that it is the X-player's turn to move next. Describe the winning strategy for this player. (Recall that a winning strategy isn't merely the best move to make in the current position. It also includes all the responses that this player must make in order to win, however the opponent moves.)
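The notion of winning strategy in Exercise 8.2 can be made concrete with a brute-force game-tree search. The sketch below is my own illustration, not the book's solution, and the function names are mine: the value of a position is computed over all opponent replies, so a returned 'X' implicitly encodes a full strategy, one response per opponent move.

```python
# Brute-force game-tree evaluation for tic-tac-toe. A board is a 9-tuple
# of 'X', 'O', or ' ' (cells 0..8, row by row).
from functools import lru_cache

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    for i, j, k in LINES:
        if board[i] != ' ' and board[i] == board[j] == board[k]:
            return board[i]
    return None

@lru_cache(maxsize=None)
def outcome(board, to_move):
    """Result of the game under optimal play by both sides: 'X', 'O', or 'draw'."""
    w = winner(board)
    if w:
        return w
    if ' ' not in board:
        return 'draw'
    other = 'O' if to_move == 'X' else 'X'
    results = {outcome(board[:i] + (to_move,) + board[i+1:], other)
               for i, c in enumerate(board) if c == ' '}
    if to_move in results:      # some move forces a win for the mover
        return to_move
    if 'draw' in results:       # otherwise settle for a draw if possible
        return 'draw'
    return other                # every move loses: the opponent wins

# From the empty board, neither player can force a win:
print(outcome((' ',) * 9, 'X'))   # → draw
```

The search visits each reachable position once (memoized), which is tiny for tic-tac-toe; for generalized games such as go-moku (Exercise 8.10), the same tree is exponentially large, which is where the PSPACE machinery comes in.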
8.3 Consider the following generalized geography game, wherein the start node is the one with the arrow pointing in from nowhere. Does Player I have a winning strategy? Does Player II? Give reasons for your answers.

[graph diagram omitted from this copy]

8.4 Show that PSPACE is closed under the operations union, complementation, and star.

ᴬ8.5 Show that NL is closed under the operations union, concatenation, and star.

8.6 Show that any PSPACE-hard language is also NP-hard.

ᴬ8.7 Show that A_DFA ∈ L.

PROBLEMS

8.8 Let EQ_REX = {⟨R, S⟩ | R and S are equivalent regular expressions}. Show that EQ_REX ∈ PSPACE.

8.9 A ladder is a sequence of strings s1, s2, ..., sk, wherein every string differs from the preceding one in exactly one character. For example, the following is a ladder of English words, starting with "head" and ending with "free": head, hear, near, fear, bear, beer, deer, deed, feed, feet, fret, free. Let LADDER_DFA = {⟨M, s, t⟩ | M is a DFA and L(M) contains a ladder of strings, starting with s and ending with t}. Show that LADDER_DFA is in PSPACE.

8.10 The Japanese game go-moku is played by two players, "X" and "O," on a 19 × 19 grid. Players take turns placing markers, and the first player to achieve 5 of his markers consecutively in a row, column, or diagonal is the winner. Consider this game generalized to an n × n board. Let GM = {⟨B⟩ | B is a position in generalized go-moku, where player "X" has a winning strategy}. By a position we mean a board with markers placed on it, such as may occur in the middle of a play of the game, together with an indication of which player moves next. Show that GM ∈ PSPACE.

8.11 Show that, if every NP-hard language is also PSPACE-hard, then PSPACE = NP.

8.12 Show that TQBF restricted to formulas where the part following the quantifiers is in conjunctive normal form is still PSPACE-complete.

8.13 Define A_LBA = {⟨M, w⟩ | M is an LBA that accepts input w}. Show that A_LBA is PSPACE-complete.
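For intuition on LADDER_DFA (Problem 8.9), here is a naive ladder search restricted to a finite word set (my own illustration, not the book's solution). In the actual problem the candidate words are all length-|s| strings of L(M), so the `seen` set below can be exponentially large in the input; that is why the problem asks only for PSPACE, where one standard approach guesses the ladder one string at a time in nondeterministic linear space and then applies Savitch's theorem.

```python
# Breadth-first search for a ladder inside a finite word set.
from collections import deque

def one_letter_apart(u, v):
    return len(u) == len(v) and sum(a != b for a, b in zip(u, v)) == 1

def has_ladder(words, s, t):
    frontier, seen = deque([s]), {s}
    while frontier:
        w = frontier.popleft()
        if w == t:
            return True
        for v in words:
            if v not in seen and one_letter_apart(w, v):
                seen.add(v)              # this set is the space bottleneck
                frontier.append(v)
    return False

# The example ladder from the text:
words = {"head", "hear", "near", "fear", "bear", "beer",
         "deer", "deed", "feed", "feet", "fret", "free"}
print(has_ladder(words, "head", "free"))   # → True
```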
8.14 Consider the following two-person version of the language PUZZLE that was described in Problem 7.26. Each player starts with an ordered stack of puzzle cards. The players take turns placing the cards in order in the box and may choose which side faces up. Player I wins if, in the final stack, all hole positions are blocked, and Player II wins if some hole position remains unblocked. Show that the problem of determining which player has a winning strategy for a given starting configuration of the cards is PSPACE-complete.

*8.15 The cat-and-mouse game is played by two players, "Cat" and "Mouse," on an arbitrary undirected graph. At a given point each player occupies a node of the graph. The players take turns moving to a node adjacent to the one that they currently occupy. A special node of the graph is called "Hole." Cat wins if the two players ever occupy the same node. Mouse wins if it reaches the Hole before the preceding happens. The game is a draw if a situation repeats (i.e., the two players simultaneously occupy positions that they simultaneously occupied previously and it is the same player's turn to move). Let

HAPPY-CAT = {⟨G, c, m, h⟩ | G, c, m, h are respectively a graph and positions of the Cat, Mouse, and Hole, such that Cat has a winning strategy if Cat moves first}.

Show that HAPPY-CAT is in P. (Hint: The solution is not complicated and doesn't depend on subtle details in the way the game is defined. Consider the entire game tree. It is exponentially big, but you can search it in polynomial time.)

8.16 Read the definition of MIN-FORMULA in Problem 7.44.
a. Show that MIN-FORMULA ∈ PSPACE.
b. Explain why this argument fails to show that MIN-FORMULA ∈ coNP: If φ ∉ MIN-FORMULA, then φ has a smaller equivalent formula. An NTM can verify that φ ∉ MIN-FORMULA by guessing that formula.

8.17 Let A be the language of properly nested parentheses. For example, (()) and (()(()))() are in A, but )( is not. Show that A is in L.
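The log-space idea behind Exercise 8.17 is that a single counter suffices, and the counter's value never exceeds the input length n, so it fits in O(log n) bits on the work tape. A sketch of that algorithm (mine, not the book's solution):

```python
def properly_nested(s):
    # One counter is the whole algorithm; since 0 <= depth <= len(s),
    # it occupies O(log n) bits, which is why the language is in L.
    depth = 0
    for c in s:
        depth += 1 if c == '(' else -1
        if depth < 0:          # a ')' with no matching '(' so far
            return False
    return depth == 0          # every '(' was eventually closed

print(properly_nested("(())"),
      properly_nested("(()(()))()"),
      properly_nested(")("))   # → True True False
```

Note that a single counter no longer suffices once brackets are mixed in, since ([)] balances both symbol counts separately; handling that in log space is part of why Problem 8.18 is starred.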
*8.18 Let B be the language of properly nested parentheses and brackets. For example, ([()()]()[]) is in B but ([)] is not. Show that B is in L.

*8.19 The game of Nim is played with a collection of piles of sticks. In one move a player may remove any nonzero number of sticks from a single pile. The players alternately take turns making moves. The player who removes the very last stick loses. Say that we have a game position in Nim with k piles containing s1, ..., sk sticks. Call the position balanced if, when each of the numbers si is written in binary and the binary numbers are written as rows of a matrix aligned at the low order bits, each column of bits contains an even number of 1s. Prove the following two facts.
a. Starting in an unbalanced position, a single move exists that changes the position into a balanced one.
b. Starting in a balanced position, every single move changes the position into an unbalanced one.
Let NIM = {⟨s1, ..., sk⟩ | each si is a binary number and Player I has a winning strategy in the Nim game starting at this position}. Use the preceding facts about balanced positions to show that NIM ∈ L.

8.20 Let MULT = {a#b#c | a, b, c are binary natural numbers and a × b = c}. Show that MULT ∈ L.

8.21 For any positive integer x, let x^R be the integer whose binary representation is the reverse of the binary representation of x. (Assume no leading 0s in the binary representation of x.) Define the function R⁺: N → N, where R⁺(x) = x + x^R.
a. Let A2 = {⟨x, y⟩ | R⁺(x) = y}. Show A2 ∈ L.
b. Let A3 = {⟨x, y⟩ | R⁺(R⁺(x)) = y}. Show A3 ∈ L.

8.22 a. Let ADD = {⟨x, y, z⟩ | x, y, z > 0 are binary integers and x + y = z}. Show that ADD ∈ L.
b. Let PAL-ADD = {⟨x, y⟩ | x, y > 0 are binary integers where x + y is an integer whose binary representation is a palindrome}. (Note that the binary representation of the sum is assumed not to have leading zeros. A palindrome is a string that equals its reverse.) Show that PAL-ADD ∈ L.
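The balance condition of Problem 8.19 is exactly the statement that the bitwise XOR of the pile sizes is 0: a bit column XORs to 0 iff it contains an even number of 1s. A sketch of the balance bookkeeping (my own helpers; this checks facts (a) and (b), not the full misère winning strategy the problem asks about):

```python
from functools import reduce
from operator import xor

def balanced(piles):
    # Each binary column has an even number of 1s  <=>  XOR of all piles is 0.
    return reduce(xor, piles, 0) == 0

def balancing_move(piles):
    """Fact (a): from an unbalanced position, one move rebalances.
    Let x be the XOR of all piles; reduce some pile s to s XOR x."""
    x = reduce(xor, piles, 0)
    for i, s in enumerate(piles):
        if s ^ x < s:                     # legal: strictly removes sticks
            return i, s ^ x
    return None                           # only when already balanced (x = 0)

piles = [3, 4, 5]                         # 011 ^ 100 ^ 101 = 010: unbalanced
i, new_size = balancing_move(piles)
piles[i] = new_size
print(balanced(piles))                    # → True
```

Computing one XOR column at a time over the binary inputs needs only a column index and a parity bit, which is the log-space flavor the problem is after.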
*8.23 Define UCYCLE = {⟨G⟩ | G is an undirected graph that contains a simple cycle}. Show that UCYCLE ∈ L. (Note: G may be a graph that is not connected.)

*8.24 For each n, exhibit two regular expressions, R and S, of length poly(n), where L(R) ≠ L(S), but where the first string on which they differ is exponentially long. In other words, L(R) and L(S) must be different, yet agree on all strings of length up to 2^(εn) for some constant ε > 0.

8.25 An undirected graph is bipartite if its nodes may be divided into two sets so that all edges go from a node in one set to a node in the other set. Show that a graph is bipartite if and only if it doesn't contain a cycle that has an odd number of nodes. Let BIPARTITE = {⟨G⟩ | G is a bipartite graph}. Show that BIPARTITE ∈ NL.

8.26 Define UPATH to be the counterpart of PATH for undirected graphs. Show that BIPARTITE ≤L UPATH. (Note: As this edition was going to press, O. Reingold [60] announced that UPATH ∈ L. Consequently, BIPARTITE ∈ L, but the algorithm is somewhat complicated.)

8.27 Recall that a directed graph is strongly connected if every two nodes are connected by a directed path in each direction. Let STRONGLY-CONNECTED = {⟨G⟩ | G is a strongly connected graph}. Show that STRONGLY-CONNECTED is NL-complete.

8.28 Let BOTH_NFA = {⟨M1, M2⟩ | M1 and M2 are NFAs where L(M1) ∩ L(M2) ≠ ∅}. Show that BOTH_NFA is NL-complete.

8.29 Show that A_NFA is NL-complete.

8.30 Show that E_DFA is NL-complete.

*8.31 Show that 2SAT is NL-complete.

*8.32 Give an example of an NL-complete context-free language.

ᴬ*8.33 Define CYCLE = {⟨G⟩ | G is a directed graph that contains a directed cycle}. Show that CYCLE is NL-complete.

SELECTED SOLUTIONS

8.5 Let A1 and A2 be languages that are decided by NL-machines N1 and N2. Construct three Turing machines: N∪ deciding A1 ∪ A2; N∘ deciding A1 ∘ A2; and N∗ deciding A1*. Each of these machines receives input w.
Machine N∪ nondeterministically branches to simulate N1 or to simulate N2. In either case, N∪ accepts if the simulated machine accepts.

Machine N∘ nondeterministically selects a position on the input to divide it into two substrings. Only a pointer to that position is stored on the work tape; insufficient space is available to store the substrings themselves. Then N∘ simulates N1 on the first substring, branching nondeterministically to simulate N1's nondeterminism. On any branch that reaches N1's accept state, N∘ simulates N2 on the second substring. On any branch that reaches N2's accept state, N∘ accepts.

Machine N∗ has a more complex algorithm, so we describe its stages.
N∗ = "On input w:
1. Initialize two input position pointers p1 and p2 to 0, the position immediately preceding the first input symbol.
2. Accept if no input symbols occur after p2.
3. Move p2 forward to a nondeterministically selected input position.
4. Simulate N1 on the substring of w from the position following p1 to the position at p2, branching nondeterministically to simulate N1's nondeterminism.
5. If this branch of the simulation reaches N1's accept state, copy p2 to p1 and go to stage 2."

8.7 Construct a TM M to decide A_DFA. When M receives input ⟨A, w⟩, a DFA and a string, M simulates A on w by keeping track of A's current state and its current head location, and updating them appropriately. The space required to carry out this simulation is O(log n) because M can record each of these values by storing a pointer into its input.

8.33 Reduce PATH to CYCLE. The idea behind the reduction is to modify the PATH problem instance ⟨G, s, t⟩ by adding an edge from t to s in G. If a path exists from s to t in G, a directed cycle will exist in the modified G. However, other cycles may exist in the modified G because they may already be present in G. To handle that problem, first change G so that it contains no cycles.
A leveled directed graph is one where the nodes are divided into groups, A1, A2, ..., Ak, called levels, and only edges from one level to the next higher level are permitted. Observe that a leveled graph is acyclic. The PATH problem for leveled graphs is still NL-complete, as the following reduction from the unrestricted PATH problem shows. Given a graph G with two nodes s and t, and m nodes in total, produce the leveled graph G′ whose levels are m copies of G's nodes. Draw an edge from node i at each level to node j in the next level if G contains an edge from i to j. Additionally, draw an edge from node i in each level to node i in the next level. Let s′ be the node s in the first level, and let t′ be the node t in the last level. Graph G contains a path from s to t iff G′ contains a path from s′ to t′. If you modify G′ by adding an edge from t′ to s′, you obtain a reduction from PATH to CYCLE. The reduction is computationally simple, and its implementation in logspace is routine. Furthermore, a straightforward procedure shows that CYCLE ∈ NL. Hence CYCLE is NL-complete.

EXERCISES

ᴬ9.1 Prove that TIME(2^n) = TIME(2^(n+1)).

ᴬ9.2 Prove that TIME(2^n) ⊊ TIME(2^(2n)).

ᴬ9.3 Prove that NTIME(n) ⊊ PSPACE.

9.4 Show how the circuit depicted in Figure 9.26 computes on input 0110 by showing the values computed by all of the gates, as we did in Figure 9.24.

9.5 Give a circuit that computes the parity function on three input variables and show how it computes on input 011.

9.6 Prove that if A ∈ P, then P^A = P.

9.7 Give regular expressions with exponentiation that generate the following languages over the alphabet {0,1}.
ᴬa. All strings of length 500
ᴬb. All strings of length 500 or less
ᴬc. All strings of length 500 or more
ᴬd. All strings of length different than 500
e. All strings that contain exactly 500 1s
f. All strings that contain at least 500 1s
g. All strings that contain at most 500 1s
h.
All strings of length 500 or more that contain a 0 in the 500th position
i. All strings that contain two 0s that have at least 500 symbols between them

9.8 If R is a regular expression, let R^(m..k) represent the expression R^m ∪ R^(m+1) ∪ ⋯ ∪ R^k. Show how to implement the R^(m..k) operator, using the ordinary exponentiation operator, but without "⋯".

9.9 Show that if NP = P^SAT, then NP = coNP.

9.10 Problem 8.13 showed that A_LBA is PSPACE-complete.
a. Do we know whether A_LBA ∈ NL? Explain your answer.
b. Do we know whether A_LBA ∈ P? Explain your answer.

9.11 Show that the language MAX-CLIQUE from Problem 7.46 is in P^SAT.

PROBLEMS

9.12 Describe the error in the following fallacious "proof" that P ≠ NP. Assume that P = NP and obtain a contradiction. If P = NP, then SAT ∈ P and so for some k, SAT ∈ TIME(n^k). Because every language in NP is polynomial time reducible to SAT, you have NP ⊆ TIME(n^k). Therefore P ⊆ TIME(n^k). But, by the time hierarchy theorem, TIME(n^(k+1)) contains a language that isn't in TIME(n^k), which contradicts P ⊆ TIME(n^k). Therefore P ≠ NP.

9.13 Consider the function pad: Σ* × N → Σ*#* that is defined as follows. Let pad(s, l) = s#^j, where j = max(0, l − m) and m is the length of s. Thus pad(s, l) simply adds enough copies of the new symbol # to the end of s so that the length of the result is at least l. For any language A and function f: N → N, define the language pad(A, f(m)) as pad(A, f(m)) = {pad(s, f(m)) | s ∈ A and m is the length of s}. Prove that, if A ∈ TIME(n^6), then pad(A, n^2) ∈ TIME(n^3).

9.14 Prove that, if NEXPTIME ≠ EXPTIME, then P ≠ NP. You may find the function pad, defined in Problem 9.13, to be helpful.

ᴬ9.15 Define pad as in Problem 9.13.
a. Prove that, for every A and natural number k, A ∈ P iff pad(A, n^k) ∈ P.
b. Prove that P ≠ SPACE(n).
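The padding function of Problems 9.13–9.15 is simple enough to state in code. Below is a sketch (mine) of pad together with a decoding helper mirroring the checks in the solution to 9.15(a), which recovers s from a candidate member of pad(A, n^k):

```python
def pad(s, l):
    # pad(s, l) = s#^j with j = max(0, l - |s|): the result has length >= l
    return s + '#' * max(0, l - len(s))

def unpad(w, k):
    """Write w as s#^j; reject if '#' occurs inside s or the length check
    |w| = max(|s|, |s|^k) fails.  Returns s, or None for malformed input.
    (My own helper, not from the text.)"""
    s = w.rstrip('#')
    if '#' in s or len(w) != max(len(s), len(s) ** k):
        return None
    return s

w = pad('0110', len('0110') ** 2)
print(w, unpad(w, 2))   # → 0110############ 0110
```

Deciding w ∈ pad(A, n^2) then amounts to these checks plus running a decider for A on s; if that decider runs in time |s|^6 and |s| = |w|^(1/2), the total is roughly |w|^3, which is the content of Problem 9.13.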
9.16 Prove that TQBF ∉ SPACE(n^(1/3)).

*9.17 Read the definition of a 2DFA (two-headed finite automaton) given in Problem 5.26. Prove that P contains a language that is not recognizable by a 2DFA.

9.18 Let E_REX↑ = {⟨R⟩ | R is a regular expression with exponentiation and L(R) = ∅}. Show that E_REX↑ ∈ P.

9.19 Define the unique-sat problem to be USAT = {⟨φ⟩ | φ is a Boolean formula that has a single satisfying assignment}. Show that USAT ∈ P^SAT.

9.20 Prove that an oracle C exists for which NP^C ≠ coNP^C.

9.21 A k-query oracle Turing machine is an oracle Turing machine that is permitted to make at most k queries on each input. A k-query oracle TM M with an oracle for A is written M^(A,k), and P^(A,k) is the collection of languages that are decidable by polynomial time k-query oracle TMs with an oracle for A.
a. Show that NP ∪ coNP ⊆ P^(SAT,1).
b. Assume that NP ≠ coNP. Show that NP ∪ coNP ⊊ P^(SAT,1).

9.22 Suppose that A and B are two oracles. One of them is an oracle for TQBF, but you don't know which. Give an algorithm that has access to both A and B and that is guaranteed to solve TQBF in polynomial time.

9.23 Define the function parity_n as in Example 9.25. Show that parity_n can be computed with O(n) size circuits.

9.24 Recall that you may consider circuits that output strings over {0,1} by designating several output gates. Let add_n: {0,1}^(2n) → {0,1}^(n+1) take the sum of two n-bit binary integers and produce the (n+1)-bit result. Show that you can compute the add_n function with O(n) size circuits.

9.25 Define the function majority_n: {0,1}^n → {0,1} as majority_n(x1, ..., xn) = 0 if Σ xi < n/2, and 1 if Σ xi ≥ n/2. Thus the majority_n function returns the majority vote of the inputs. Show that majority_n can be computed with
a. O(n^2) size circuits.
b. O(n log n) size circuits. (Hint: Recursively divide the number of inputs in half and use the result of Problem 9.24.)

*9.26 Define the problem majority_n as in Problem 9.25. Show that it may be computed with O(n) size circuits.
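The O(n)-size circuit behind Problem 9.24 is a ripple-carry adder: n full-adder stages, each a constant number of AND/OR/XOR gates. A simulation sketch (mine), with bits given most significant first:

```python
def ripple_add(x, y):
    """Simulate an O(n)-gate ripple-carry adder on two n-bit inputs,
    producing the (n+1)-bit sum.  Each loop iteration is one full adder,
    a constant number of gates, so n stages use O(n) gates in total."""
    assert len(x) == len(y)
    carry, out = 0, []
    for a, b in zip(reversed(x), reversed(y)):   # low-order bits first
        out.append(a ^ b ^ carry)                # sum bit of this stage
        carry = (a & b) | (carry & (a ^ b))      # carry-out of this stage
    out.append(carry)
    return out[::-1]                             # most significant bit first

print(ripple_add([0, 1, 1], [1, 0, 1]))          # 3 + 5 → [1, 0, 0, 0]
```

The hint in 9.25(b) uses exactly this component: recursively add the 1-counts of the two halves of the input, for O(n log n) gates overall.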
SELECTED SOLUTIONS

9.1 The time complexity classes are defined in terms of the big-O notation, so constant factors have no effect. The function 2^(n+1) is O(2^n), and thus A ∈ TIME(2^n) iff A ∈ TIME(2^(n+1)).

9.2 The containment TIME(2^n) ⊆ TIME(2^(2n)) holds because 2^n ≤ 2^(2n). The containment is proper by virtue of the time hierarchy theorem. The function 2^(2n) is time constructible because a TM can write the number 1 followed by 2n 0s in O(2^(2n)) time. Hence the theorem guarantees that a language A exists that can be decided in O(2^(2n)) time but not in o(2^(2n)/log 2^(2n)) = o(2^(2n)/2n) time. Therefore A ∈ TIME(2^(2n)) but A ∉ TIME(2^n).

9.3 NTIME(n) ⊆ NSPACE(n) because any Turing machine that operates in time t(n) on every computation branch can use at most t(n) tape cells on every branch. Furthermore, NSPACE(n) ⊆ SPACE(n^2) due to Savitch's theorem. Moreover, SPACE(n^2) ⊊ SPACE(n^3) because of the space hierarchy theorem. The result follows because SPACE(n^3) ⊆ PSPACE.

9.7 (a) Σ^500; (b) (Σ ∪ ε)^500; (c) Σ^500 Σ*; (d) (Σ ∪ ε)^499 ∪ Σ^501 Σ*.

9.15 (a) Let A be any language and k ∈ N. If A ∈ P, then pad(A, n^k) ∈ P, because you can determine whether w ∈ pad(A, n^k) by writing w as s#^j, where s doesn't contain the # symbol; then testing whether |w| = |s|^k; and finally testing whether s ∈ A. Implementing the first two tests in polynomial time is straightforward. The third test runs in time poly(|s|), and because |s| ≤ |w|, it runs in time poly(|w|) and hence is in polynomial time.
If pad(A, n^k) ∈ P, then A ∈ P, because you can determine whether w ∈ A by padding w with # symbols until it has length |w|^k and then testing whether the result is in pad(A, n^k). Both of these tests require only polynomial time.

(b) Assume that P = SPACE(n). Let A be a language in SPACE(n^2) but not in SPACE(n), as shown to exist in the space hierarchy theorem.
The language pad(A, n^2) ∈ SPACE(n) because you have enough space to run the O(n^2) space algorithm for A, using space that is linear in the padded language. Because of the assumption, pad(A, n^2) ∈ P, hence A ∈ P by part (a), and hence A ∈ SPACE(n), due to the assumption once again. But that is a contradiction.

We can use a trapdoor function, such as the RSA trapdoor function, to construct a public-key cryptosystem as follows. The public key is the index i generated by the probabilistic machine G. The secret key is the corresponding value t. The encryption algorithm breaks the message m into blocks of size at most log N. For each block w the sender computes f_i(w). The resulting sequence of strings is the encrypted message. The receiver uses the function h to obtain the original message from its encryption.

EXERCISES

10.1 Show that a circuit family with depth O(log n) is also a polynomial size circuit family.

10.2 Show that 12 is not pseudoprime because it fails some Fermat test.

10.3 Prove that, if A ≤L B and B is in NC, then A is in NC.

10.4 Show that the parity function with n inputs can be computed by a branching program that has O(n) nodes.

10.5 Show that the majority function with n inputs can be computed by a branching program that has O(n^2) nodes.

10.6 Show that any function with n inputs can be computed by a branching program that has O(2^n) nodes.

ᴬ10.7 Show that BPP ⊆ PSPACE.

PROBLEMS

10.8 Let A be a regular language over {0,1}. Show that A has size-depth complexity (O(n), O(log n)).

*10.9 A Boolean formula is a Boolean circuit wherein every gate has only one output wire. The same input variable may appear in multiple places of a Boolean formula. Prove that a language has a polynomial size family of formulas iff it is in NC¹. Ignore uniformity considerations.
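Exercise 10.2 can be checked directly: a number p passes the Fermat test at base a if a^(p−1) ≡ 1 (mod p), and a pseudoprime must pass it at the relevant bases. For 12, base 5 (which is relatively prime to 12) already fails it. A quick check (my own sketch):

```python
def passes_fermat(p, a):
    # Fermat test at base a: p passes if a^(p-1) ≡ 1 (mod p)
    return pow(a, p - 1, p) == 1

# Base 5 is relatively prime to 12, yet 5^11 ≡ 5 (mod 12),
# because 5^2 ≡ 1 (mod 12) and 11 is odd.
print(pow(5, 11, 12))        # → 5
print(passes_fermat(12, 5))  # → False: 12 fails a Fermat test
print(passes_fermat(7, 5))   # → True: a prime passes at every coprime base
```

The three-argument pow performs modular exponentiation, the same fast operation the probabilistic primality algorithm of this chapter relies on.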
*10.10 A k-head pushdown automaton (k-PDA) is a deterministic pushdown automaton with k read-only, two-way input heads and a read/write stack. Define the class PDA_k = {A | A is recognized by a k-PDA}. Show that P = ∪_k PDA_k. (Hint: Recall that P equals alternating log space.)

10.11 Let M be a probabilistic polynomial time Turing machine, and let C be a language where, for some fixed 0 < ε1 < ε2 < 1,
a. w ∉ C implies Pr[M accepts w] ≤ ε1, and
b. w ∈ C implies Pr[M accepts w] ≥ ε2.
Show that C ∈ BPP. (Hint: Use the result of Lemma 10.5.)

10.12 Show that, if P = NP, then P = PH.

10.13 Show that, if PH = PSPACE, then the polynomial time hierarchy has only finitely many distinct levels.

10.14 Recall that NP^SAT is the class of languages that are decided by nondeterministic polynomial time Turing machines with an oracle for the satisfiability problem. Show that NP^SAT = Σ2P.

*10.15 Prove Fermat's little theorem, which is given in Theorem 10.6. (Hint: Consider the sequence a^1, a^2, .... What must happen, and how?)

ᴬ*10.16 Prove that, for any integer p > 1, if p isn't pseudoprime, then p fails the Fermat test for at least half of all numbers in Z_p^+.

10.17 Prove that, if A is a language in L, a family of branching programs (B1, B2, ...) exists wherein each Bn accepts exactly the strings in A of length n and is bounded in size by a polynomial in n.

10.18 Prove that, if A is a regular language, a family of branching programs (B1, B2, ...) exists wherein each Bn accepts exactly the strings in A of length n and is bounded in size by a constant times n.

10.19 Show that, if NP ⊆ BPP, then NP = RP.

10.20 Define a ZPP-machine to be a probabilistic Turing machine that is permitted three types of output on each of its branches: accept, reject, and ?.
A ZPP-machine M decides a language A if M outputs the correct answer on every input string w (accept if w ∈ A and reject if w ∉ A) with probability at least 2/3, and M never outputs the wrong answer. On every input, M may output ? with probability at most 1/3. Furthermore, the average running time over all branches of M on w must be bounded by a polynomial in the length of w. Show that RP ∩ coRP = ZPP.

10.21 Let EQ_BP = {⟨B1, B2⟩ | B1 and B2 are equivalent branching programs}. Show that EQ_BP is coNP-complete.

10.22 Let BPL be the collection of languages that are decided by probabilistic log space Turing machines with error probability 1/3. Prove that BPL ⊆ P.

SELECTED SOLUTIONS

10.7 If M is a probabilistic TM that runs in polynomial time, we can modify M so that it makes exactly n^r coin tosses on each branch of its computation, for some constant r. Thus the problem of determining the probability that M accepts its input string reduces to counting how many branches are accepting and comparing this number with (2/3)·2^(n^r). This count can be performed by using polynomial space.

10.16 Call a a witness if it fails the Fermat test for p; that is, if a^(p−1) ≢ 1 (mod p).

Let Z_p^* be all numbers in {1, ..., p−1} that are relatively prime to p. If p isn't pseudoprime, it has a witness a in Z_p^*.

Use a to get many more witnesses. Find a unique witness in Z_p^* for each nonwitness. If d ∈ Z_p^* is a nonwitness, you have d^(p−1) ≡ 1 (mod p). Hence (da)^(p−1) ≢ 1 (mod p), and so da mod p is a witness. If d1 and d2 are distinct nonwitnesses in Z_p^*, then d1·a mod p ≠ d2·a mod p. Otherwise (d1 − d2)a ≡ 0 (mod p), and thus (d1 − d2)a = cp for some integer c. But d1 and d2 are in Z_p^*, and thus (d1 − d2) < p, so a = cp/(d1 − d2), and a and p have a factor greater than 1 in common, which is impossible because a and p are relatively prime.
Thus the number of witnesses in Z_p^* must be as large as the number of nonwitnesses in Z_p^*, and consequently at least half of the members of Z_p^* are witnesses.

Next show that every member b of Z_p^+ that is not relatively prime to p is a witness. If b and p share a factor, then b^e and p share that factor for any e > 0. Hence b^(p−1) ≢ 1 (mod p). Therefore you can conclude that at least half of the members of Z_p^+ are witnesses.

Selected Bibliography

1. ADLEMAN, L. Two theorems on random polynomial time. In Proceedings of the Nineteenth IEEE Symposium on Foundations of Computer Science (1978), pp. 75–83.
2. ADLEMAN, L. M., AND HUANG, M. A. Recognizing primes in random polynomial time. In Proceedings of the Nineteenth Annual ACM Symposium on the Theory of Computing (1987), pp. 462–469.
3. ADLEMAN, L. M., POMERANCE, C., AND RUMELY, R. S. On distinguishing prime numbers from composite numbers. Annals of Mathematics 117 (1983), 173–206.
4. AGRAWAL, M., KAYAL, N., AND SAXENA, N. PRIMES is in P. (2002), http://www.cse.iitk.ac.in/news/primality.pdf.
5. AHO, A. V., HOPCROFT, J. E., AND ULLMAN, J. D. Data Structures and Algorithms. Addison-Wesley, 1982.
6. AHO, A. V., SETHI, R., AND ULLMAN, J. D. Compilers: Principles, Techniques, Tools. Addison-Wesley, 1986.
7. AKL, S. G. The Design and Analysis of Parallel Algorithms. Prentice-Hall International, 1989.
8. ALON, N., ERDŐS, P., AND SPENCER, J. H. The Probabilistic Method. John Wiley & Sons, 1992.
9. ANGLUIN, D., AND VALIANT, L. G. Fast probabilistic algorithms for Hamiltonian circuits and matchings. Journal of Computer and System Sciences 18 (1979), 155–193.
10. ARORA, S., LUND, C., MOTWANI, R., SUDAN, M., AND SZEGEDY, M. Proof verification and hardness of approximation problems. In Proceedings of the Thirty-third IEEE Symposium on Foundations of Computer Science (1992), pp. 14–23.
11. BAASE, S. Computer Algorithms: Introduction to Design and Analysis. Addison-Wesley, 1978.
12. BABAI, L.
E-mail and the unexpected power of interaction. In Proceedings of the Fifth Annual Conference on Structure in Complexity Theory (1990), pp. 30–44.
13. BACH, E., AND SHALLIT, J. Algorithmic Number Theory, Vol. 1. MIT Press, 1996.
14. BALCÁZAR, J. L., DÍAZ, J., AND GABARRÓ, J. Structural Complexity I, II. EATCS Monographs on Theoretical Computer Science. Springer-Verlag, 1988 (I) and 1990 (II).
15. BEAME, P. W., COOK, S. A., AND HOOVER, H. J. Log depth circuits for division and related problems. SIAM Journal on Computing 15, 4 (1986), 994–1003.
16. BLUM, M., CHANDRA, A., AND WEGMAN, M. Equivalence of free boolean graphs can be decided probabilistically in polynomial time. Information Processing Letters 10 (1980), 80–82.
17. BRASSARD, G., AND BRATLEY, P. Algorithmics: Theory and Practice. Prentice-Hall, 1988.
18. CARMICHAEL, R. D. On composite numbers p which satisfy the Fermat congruence a^(p−1) ≡ 1 (mod p). American Mathematical Monthly 19 (1912), 22–27.
19. CHOMSKY, N. Three models for the description of language. IRE Transactions on Information Theory 2 (1956), 113–124.
20. COBHAM, A. The intrinsic computational difficulty of functions. In Proceedings of the International Congress for Logic, Methodology, and Philosophy of Science, Y. Bar-Hillel, Ed. North-Holland, 1964, pp. 24–30.
21. COOK, S. A. The complexity of theorem-proving procedures. In Proceedings of the Third Annual ACM Symposium on the Theory of Computing (1971), pp. 151–158.
22. CORMEN, T., LEISERSON, C., AND RIVEST, R. Introduction to Algorithms. MIT Press, 1989.
23. EDMONDS, J. Paths, trees, and flowers. Canadian Journal of Mathematics 17 (1965), 449–467.
24. ENDERTON, H. B. A Mathematical Introduction to Logic. Academic Press, 1972.
25. EVEN, S. Graph Algorithms. Pitman, 1979.
26. FELLER, W. An Introduction to Probability Theory and Its Applications, Vol. 1. John Wiley & Sons, 1970.
27. FEYNMAN, R. P., HEY, A. J. G., AND ALLEN, R. W. Feynman Lectures on Computation. Addison-Wesley, 1996.
28.
Garey, M. R., and Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
29. Gill, J. T. Computational complexity of probabilistic Turing machines. SIAM Journal on Computing 6, 4 (1977), 675-695.
30. Gödel, K. On formally undecidable propositions in Principia Mathematica and related systems I. In The Undecidable, M. Davis, Ed. Raven Press, 1965, pp. 4-38.
31. Goemans, M. X., and Williamson, D. P. .878-approximation algorithms for MAX CUT and MAX 2SAT. In Proceedings of the Twenty-sixth Annual ACM Symposium on the Theory of Computing (1994), pp. 422-431.
32. Goldwasser, S., and Micali, S. Probabilistic encryption. Journal of Computer and System Sciences (1984), 270-299.
33. Goldwasser, S., Micali, S., and Rackoff, C. The knowledge complexity of interactive proof systems. SIAM Journal on Computing (1989), 186-208.
34. Greenlaw, R., Hoover, H. J., and Ruzzo, W. L. Limits to Parallel Computation: P-Completeness Theory. Oxford University Press, 1995.
35. Harary, F. Graph Theory, 2d ed. Addison-Wesley, 1971.
36. Hartmanis, J., and Stearns, R. E. On the computational complexity of algorithms. Transactions of the American Mathematical Society 117 (1965), 285-306.
37. Hilbert, D. Mathematical problems. Lecture delivered before the International Congress of Mathematicians at Paris in 1900. In Mathematical Developments Arising from Hilbert Problems, vol. 28. American Mathematical Society, 1976, pp. 1-34.
38. Hofstadter, D. R. Gödel, Escher, Bach: An Eternal Golden Braid. Basic Books, 1979.
39. Hopcroft, J. E., and Ullman, J. D. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979.
40. Johnson, D. S. The NP-completeness column: Interactive proof systems for fun and profit. Journal of Algorithms 9, 3 (1988), 426-444.
41. Karp, R. M. Reducibility among combinatorial problems. In Complexity of Computer Computations (1972), R. E. Miller and J. W. Thatcher, Eds., Plenum Press, pp. 85-103.
42. Karp, R. M., and Lipton, R. J. Turing machines that take advice. L'Enseignement Mathématique 28 (1982).
43. Lawler, E. L. Combinatorial Optimization: Networks and Matroids. Holt, Rinehart and Winston, 1991.
44. Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., and Shmoys, D. B. The Traveling Salesman Problem. John Wiley & Sons, 1985.
45. Leighton, F. T. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, 1991.
46. Levin, L. Universal search problems (in Russian). Problemy Peredachi Informatsii 9, 3 (1973), 115-116.
47. Lewis, H., and Papadimitriou, C. Elements of the Theory of Computation. Prentice-Hall, 1981.
48. Li, M., and Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, 1993.
49. Lichtenstein, D., and Sipser, M. GO is PSPACE hard. Journal of the ACM (1980), 393-401.
50. Luby, M. Pseudorandomness and Cryptographic Applications. Princeton University Press, 1996.
51. Lund, C., Fortnow, L., Karloff, H., and Nisan, N. Algebraic methods for interactive proof systems. Journal of the ACM 39, 4 (1992), 859-868.
52. Miller, G. L. Riemann's hypothesis and tests for primality. Journal of Computer and System Sciences 13 (1976), 300-317.
53. Niven, I., and Zuckerman, H. S. An Introduction to the Theory of Numbers, 4th ed. John Wiley & Sons, 1980.
54. Papadimitriou, C. H. Computational Complexity. Addison-Wesley, 1994.
55. Papadimitriou, C. H., and Steiglitz, K. Combinatorial Optimization (Algorithms and Complexity). Prentice-Hall, 1982.
56. Papadimitriou, C. H., and Yannakakis, M. Optimization, approximation, and complexity classes. Journal of Computer and System Sciences 43, 3 (1991), 425-440.
57. Pomerance, C. On the distribution of pseudoprimes. Mathematics of Computation 37, 156 (1981), 587-593.
58. Pratt, V. R. Every prime has a succinct certificate. SIAM Journal on Computing 4, 3 (1975), 214-220.
59. Rabin, M. O.
Probabilistic algorithms. In Algorithms and Complexity: New Directions and Recent Results, J. F. Traub, Ed. Academic Press, 1976, pp. 21-39.
60. Reingold, O. Undirected st-connectivity in log-space. (2004), http://www.eccc.uni-trier.de/eccc-reports/2004/TR04-094.
61. Rivest, R. L., Shamir, A., and Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM 21, 2 (1978), 120-126.
62. Roche, E., and Schabes, Y. Finite-State Language Processing. MIT Press, 1997.
63. Schaefer, T. J. On the complexity of some two-person perfect-information games. Journal of Computer and System Sciences 16, 2 (1978), 185-225.
64. Sedgewick, R. Algorithms, 2d ed. Addison-Wesley, 1989.
65. Shamir, A. IP = PSPACE. Journal of the ACM 39, 4 (1992), 869-877.
66. Shen, A. IP = PSPACE: Simplified proof. Journal of the ACM 39, 4 (1992), 878-880.
67. Shor, P. W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing 26 (1997), 1484-1509.
68. Sipser, M. Lower bounds on the size of sweeping automata. Journal of Computer and System Sciences 21, 2 (1980), 195-202.
69. Sipser, M. The history and status of the P versus NP question. In Proceedings of the Twenty-fourth Annual ACM Symposium on the Theory of Computing (1992), pp. 603-618.
70. Stinson, D. R. Cryptography: Theory and Practice. CRC Press, 1995.
71. Tarjan, R. E. Data Structures and Network Algorithms, vol. 44 of CBMS-NSF Reg. Conf. Ser. Appl. Math. SIAM, 1983.
72. Turing, A. M. On computable numbers, with an application to the Entscheidungsproblem. In Proceedings of the London Mathematical Society (1936), pp. 230-265.
73. Ullman, J. D., Aho, A. V., and Hopcroft, J. E. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
74. van Leeuwen, J., Ed. Handbook of Theoretical Computer Science A: Algorithms and Complexity. Elsevier, 1990.
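The Fermat-witness argument in the solution preceding the bibliography can be checked concretely for a small composite number. The sketch below is my own illustration (not from the text); the helper name `fermat_witnesses` is hypothetical. It counts the b in {1, ..., n-1} with b^(n-1) ≢ 1 (mod n) for a composite, non-Carmichael n and confirms the two claims: every b sharing a factor with n is a witness, and at least half of the members are witnesses.

```python
# Sketch (my own illustration): Fermat witnesses for a composite number.
from math import gcd

def fermat_witnesses(n):
    """Members b of {1, ..., n-1} with pow(b, n-1, n) != 1."""
    return [b for b in range(1, n) if pow(b, n - 1, n) != 1]

n = 15  # composite and not a Carmichael number
witnesses = fermat_witnesses(n)

# Every b not relatively prime to n is a witness, as argued in the solution:
# if b and n share a factor, so do b^(n-1) and n, so b^(n-1) cannot be 1 mod n.
for b in range(1, n):
    if gcd(b, n) > 1:
        assert b in witnesses

# At least half of the members of {1, ..., n-1} are witnesses.
assert 2 * len(witnesses) >= n - 1

# By Fermat's little theorem, a prime has no Fermat witnesses at all.
assert fermat_witnesses(7) == []
```

For n = 15 the only nonwitnesses are 1, 4, 11, and 14 (the b with b^2 ≡ 1 mod 15), so 10 of the 14 members are witnesses, comfortably more than half.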
Index

Symbols
N (natural numbers), 4, 227
R (real numbers), 157, 176
R+ (nonnegative real numbers), 249
∅ (empty set), 4
∈ (element), 3
∉ (not element), 3
⊆ (subset), 3
⊂ (proper subset), 4
⊊ (proper subset), 328
∪ (union operation), 4, 44
∩ (intersection operation), 4
× (Cartesian or cross product), 6
ε (empty string), 13
w^R (reverse of w), 14
¬ (negation operation), 14
∧ (conjunction operation), 14
∨ (disjunction operation), 14
⊕ (exclusive OR operation), 15
→ (implication operation), 15
↔ (equality operation), 15
⇐ (reverse implication), 18
⇒ (implication), 18
⇔ (logical equivalence), 18
∘ (concatenation operation), 44
* (star operation), 44
+ (plus operation), 65
P(Q) (power set), 53
Σ (alphabet), 53
Σ_ε (Σ ∪ {ε}), 53
⟨·⟩ (encoding), 157, 259
⊔ (blank), 140
≤_m (mapping reduction), 207
≤_T (Turing reduction), 233
≤_L (log space reduction), 324
≤_P (polynomial time reduction), 272
d(x) (minimal description), 236
Th(M) (theory of model), 226
K(x) (descriptive complexity), 236
∀ (universal quantifier), 310
∃ (existential quantifier), 310
↑ (exponentiation), 343
O(f(n)) (big-O notation), 249-250
o(f(n)) (small-o notation), 250

Accept state, 34, 35 Acceptance problem for CFG, 170 for DFA, 166 for LBA, 194 for NFA, 167 for TM, 174 Accepting computation history, 193 Accepting configuration, 141 Accepts a language, meaning of, 36 A_CFG, 170 Acyclic graph, 376 A_DFA, 166 Adjacency matrix, 259 Adleman, Leonard M., 415, 418 Agrawal, Manindra, 415 Aho, Alfred V., 415, 419 Akl, Selim G., 415 A_LBA, 194 Algorithm complexity analysis, 248-253 decidability and undecidability, 165-182 defined, 154-156 describing, 156-159 Euclidean, 261 polynomial time, 256-263 running time, 248 ALL_CFG, 197 Allen, Robin W., 416 Alon, Noga, 415 Alphabet, defined, 13 Alternating Turing machine, 381 Alternation, 380-386 Ambiguity, 105-106 Ambiguous grammar, 105, 212 Ambiguous NFA, 184 Amplification lemma, 369 AND operation, 14 A_NFA, 167 Angluin, Dana, 415 Anti-clique, 27 Approximation
algorithm, 365-367 A_REX, 168 Argument, 8 Arithmetization, 394 Arity, 8, 225 Arora, Sanjeev, 415 ASPACE(f(n)), 382 Asymptotic analysis, 248 Asymptotic notation big-O notation, 249-250 small-o notation, 250 Asymptotic upper bound, 249 ATIME(f(n)), 382 A_TM, 174 Atomic formula, 225 Automata theory, 3, see also Context-free language; Regular language Average-case analysis, 248 Baase, Sara, 415 Babai, László, 415 Bach, Eric, 415 Balcázar, José Luis, 416 Basis of induction, 23 Beame, Paul W., 416 Big-O notation, 248-250 Binary function, 8 Binary operation, 44 Binary relation, 8, 9 Bipartite graph, 332 Blank symbol ⊔, 140 Blum, Manuel, 416 Boolean circuit, 351-359 depth, 400 gate, 352 size, 400 uniform family, 400 wire, 352 Boolean formula, 271, 310 minimal, 349 quantified, 311 Boolean logic, 14-15 Boolean matrix multiplication, 401 Boolean operation, 14, 225, 271 Boolean variable, 271 Bound variable, 310 Branching program, 376 read-once, 377 Brassard, Gilles, 416 Bratley, Paul, 416 Breadth-first search, 256 Brute-force search, 257, 260, 264, 270 Cantor, Georg, 174 Carmichael number, 372 Carmichael, R.
D., 416 Cartesian product, 6, 46 CFG, see Context-free grammar CFL, see Context-free language Chaitin, Gregory J., 236 Chandra, Ashok, 416 Characteristic sequence, 178 Checkers, game of, 320 Chernoff bound, 370 Chess, game of, 320 Chinese remainder theorem, 373 Chomsky normal form, 106-109, 130, 170, 263 Chomsky, Noam, 416 Church, Alonzo, 2, 155, 227 Church-Turing thesis, 155-156, 253 CIRCUIT-SAT, 358 Circuit-satisfiability problem, 358 CIRCUIT-VALUE, 404 Circular definition, 65 Clause, 274 Clique, 27, 268 CLIQUE, 268 Closed under, 45 Closure under complementation context-free languages, non-, 128 P, 294 regular languages, 85 Closure under concatenation context-free languages, 129 NP, 295 P, 294 regular languages, 47, 60 Closure under intersection context-free languages, non-, 128 regular languages, 46 Closure under star context-free languages, 129 NP, 295 P, 295 regular languages, 62 Closure under union context-free languages, 129 NP, 295 P, 294 regular languages, 45, 59 CNF-formula, 274 Co-Turing-recognizable language, 181 Cobham, Alan, 416 Coefficient, 155 Coin-flip step, 368 Complement operation, 4 Complexity class ASPACE(f(n)), 382 ATIME(f(n)), 382 BPP, 369 coNL, 326 coNP, 269 EXPSPACE, 340 EXPTIME, 308 IP, 389 L, 321 NC, 402 NL, 321 NP, 264-270 NPSPACE, 308 NSPACE(f(n)), 304 NTIME(f(n)), 267 P, 256-263, 269-270 PH, 386 PSPACE, 308 RP, 375 SPACE(f(n)), 304 TIME(f(n)), 251 ZPP, 412 Complexity theory, 2 Church-Turing thesis, 155-156, 253 Composite number, 265, 371 Compositeness witness, 373 COMPOSITES, 265 Compressible string, 239 Computability theory, 2 decidability and undecidability, 165-182 recursion theorem, 217-224 reducibility, 187-211 Turing machines, 137-154 Computable function, 206 Computation history context-free languages, 197-198 defined, 192 linear bounded automata, 193-197 Post correspondence problem, 199-205 reducibility, 192-205 Computational model, 31 Computer virus, 222 Concatenation of strings, 14
Concatenation operation, 44, 47, 60-61 Configuration, 140, 141, 322 Conjunction operation, 14 Conjunctive normal form, 274 coNL, 326 Connected graph, 11, 157 coNP, 269 Context-free grammar ambiguous, 105, 212 defined, 102 Context-free language decidability, 170-172 defined, 101 efficient decidability, 262-263 inherently ambiguous, 106 pumping lemma, 123-128 Cook, Stephen A., 271, 359, 402, 416 Cook-Levin theorem, 271-360 Cormen, Thomas, 416 Corollary, 17 Correspondence, 175 Countable set, 175 Counterexample, 18 Counting problem, 392 Cross product, 6 Cryptography, 405-411 Cut edge, 367 Cut, in a graph, 296, 367 Cycle, 11 Davis, Martin, 155 Decidability, see also Undecidability context-free language, 170-172 of A_CFG, 170 of A_DFA, 166 of A_REX, 168 of E_CFG, 171 of EQ_DFA, 169 regular language, 166-170 Decidable language, 142 Decider deterministic, 142 nondeterministic, 152 Decision problem, 366 Definition, 17 Degree of a node, 10 DeMorgan's laws, example of proof, 20 Depth complexity, 400 Derivation, 100 leftmost, 106 Derives, 102 Descriptive complexity, 236 Deterministic computation, 47 Deterministic finite automaton acceptance problem, 166 defined, 35 emptiness testing, 168 minimization, 299 DFA, see Deterministic finite automaton Diagonalization method, 174-181 Díaz, Josep, 416 Difference hierarchy, 300 Digital signatures, 407 Directed graph, 12 Directed path, 12 Disjunction operation, 14 Distributive law, 15 Domain of a function, 7 Dynamic programming, 262 E_CFG, 171 E_DFA, 168 Edge of a graph, 10 Edmonds, Jack, 416 E_LBA, 195 Element distinctness problem, 147 Element of a set, 3 Emptiness testing for CFG, 171 for DFA, 168 for LBA, 195 for TM, 189 Empty set, 4 Empty string, 13 Encoding, 157, 259 Enderton, Herbert B., 416 Enumerator, 152-153 EQ_CFG,
172 EQ_DFA, 169 EQ_REX↑, 344 EQ_TM Turing-unrecognizability, 210 undecidability, 192 Equality operation, 15 Equivalence relation, 9 Equivalent machines, 54 Erdős, Paul, 415 Error probability, 369 E_TM, undecidability, 189 Euclidean algorithm, 261 Even, Shimon, 416 EXCLUSIVE OR operation, 15 Existential state, 381 Exponential bound, 250 Exponential, versus polynomial, 257 EXPSPACE, 340 EXPSPACE-completeness, 343-349 EXPTIME, 308 Factor of a number, 371 Feller, William, 416 Fermat test, 372 Fermat's little theorem, 371 Feynman, Richard P., 416 Final state, 35 Finite automaton automatic door example, 32 computation of, 40 decidability, 166-170 defined, 35 designing, 41-44 transition function, 35 two-dimensional, 213 two-headed, 212 Finite state machine, see Finite automaton Finite state transducer, 87 Fixed point theorem, 223 Formal proof, 230 Formula, 225, 271 FORMULA-GAME, 314 Fortnow, Lance, 418 Free variable, 225 FST, see Finite state transducer Function, 7-9 argument, 8 binary, 8 computable, 206 domain, 7 one-to-one, 175 one-way, 408 onto, 7, 175 polynomial time computable, 272 range, 7 space constructible, 336 time constructible, 340 transition, 35 unary, 8 Gabarró, Joaquim, 416 Gadget in a completeness proof, 283 Game, 313 Garey, Michael R., 416 Gate in a Boolean circuit, 352 Generalized geography, 316 Generalized nondeterministic finite automaton, 70-76 converting to a regular expression, 71 defined, 70, 73 Geography game, 315 GG (generalized geography), 317 Gill, John T., 416 GNFA, see Generalized nondeterministic finite automaton GO, game of, 320 Go-moku, game of, 330 Gödel, Kurt, 2, 227, 230, 416 Goemans, Michel X., 416 Goldwasser, Shafi, 416, 417 Graph acyclic, 376 coloring, 297 cycle in, 11 degree, 10 directed, 12 edge, 10 isomorphism problem, 295, 387 k-regular, 21 labeled, 10 node, 10 strongly connected, 12 sub-, 11 undirected, 10 vertex, 10 Greenlaw, Raymond, 417 Halting configuration, 141 Halting problem, 173-181 unsolvability of, 174 HALT_TM,
188 Hamiltonian path problem, 264 exponential time algorithm, 264 NP-completeness of, 286-291 polynomial time verifier, 265 HAMPATH, 264, 286 Harary, Frank, 417 Hartmanis, Juris, 417 Hey, Anthony J. G., 416 Hierarchy theorem, 336-347 space, 337 time, 341 High-level description of a Turing machine, 157 Hilbert, David, 154, 417 Hofstadter, Douglas R., 417 Hoover, H. James, 416, 417 Hopcroft, John E., 415, 417, 419 Huang, Ming-Deh A., 415 iff, 18 Implementation description of a Turing machine, 157 Implication operation, 15 Incompleteness theorem, 230 Incompressible string, 239 Indegree of a node, 12 Independent set, 27 Induction basis, 23 proof by, 22-25 step, 23 Induction hypothesis, 23 Inductive definition, 65 Infinite set, 4 Infix notation, 8 Inherent ambiguity, 106 Inherently ambiguous context-free language, 106 Integers, 4 Interactive proof system, 387-399 Interpretation, 226 Intersection operation, 4 ISO, 387 Isomorphic graphs, 295 Johnson, David S., 416, 417 k-ary function, 8 k-ary relation, 8 k-clique, 267 k-optimal approximation algorithm, 367 k-tuple, 6 Karloff, Howard, 418 Karp, Richard M., 417 Kayal, Neeraj, 415 Kolmogorov, Andrei N., 236 L, 321 Labeled graph, 10 Ladder, 330 Language co-Turing-recognizable, 181 context-free, 101 decidable, 142 defined, 14 of a grammar, 101 recursively enumerable, 142 regular, 40 Turing-decidable, 142 Turing-recognizable, 142 Turing-unrecognizable, 181 Lawler, Eugene L., 417 LBA, see Linear bounded automaton Leaf in a tree, 11 Leeuwen, Jan van, 419 Leftmost derivation, 106 Leighton, F.
Thomson, 417 Leiserson, Charles E., 416 Lemma, 17 Lenstra, Jan Karel, 417 Leveled graph, 333 Levin, Leonid A., 271, 359, 417 Lewis, Harry, 417 Lexical analyzer, 66 Lexicographic ordering, 14 Li, Ming, 417 Lichtenstein, David, 418 Linear bounded automaton, 193-197 Linear time, 253 Lipton, Richard J., 417 LISP, 154 Literal, 274 Log space computable function, 324 Log space reduction, 324, 404 Log space transducer, 324 Luby, Michael, 418 Lund, Carsten, 415, 418 Majority function, 363 Many-one reducibility, 206 Mapping, 7 Mapping reducibility, 206-211 polynomial time, 272 Markov chain, 33 Match, 199 Matijasevič, Yuri, 155 MAX-CLIQUE, 300, 361 MAX-CUT, 296 Maximization problem, 367 Member of a set, 3 Micali, Silvio, 416, 417 Miller, Gary L., 418 MIN-FORMULA, 383 Minesweeper, 298 Minimal Boolean formula, 349 Minimal description, 236 Minimal formula, 383 Minimization of a DFA, 299 Minimization problem, 366 Minimum pumping length, 91 Model, 226 MODEXP, 295 Modulo operation, 8 Motwani, Rajeev, 415 Multiset, 4, 269 Multitape Turing machine, 148-150 Myhill-Nerode theorem, 91 NL, 321 NL-complete problem PATH, 322 NL-completeness defined, 324 Natural numbers, 4 NC, 402 Negation operation, 14 NFA, see Nondeterministic finite automaton Nim, game of, 331 Nisan, Noam, 418 Niven, Ivan, 418 Node of a graph, 10 degree, 10 indegree, 12 outdegree, 12 Nondeterministic computation, 47 Nondeterministic finite automaton, 47-58 computation by, 48 defined, 53 equivalence with deterministic finite automaton, 55 equivalence with regular expression, 66 Nondeterministic polynomial time, 266 Nondeterministic Turing machine, 150-152 space complexity of, 304 time complexity of, 255 NONISO, 387 NOT operation, 14 NP, 264-270 NP-complete problem 3SAT, 274, 359 CIRCUIT-SAT, 358 HAMPATH, 286 SUBSET-SUM, 292 3COLOR, 297 UHAMPATH, 291 VERTEX-COVER, 284 NP-completeness, 271-294 defined, 276 NP-hard, 298 NP-problem, 266 NP^A, 348 NPSPACE, 308 NSPACE(f(n)), 304 NTIME(f(n)), 267 NTM, see
Nondeterministic Turing machine o(f(n)) (small-o notation), 250 One-sided error, 375 One-time pad, 406 One-to-one function, 175 One-way function, 408 One-way permutation, 408 Onto function, 7, 175 Optimal solution, 366 Optimization problem, 365 OR operation, 14 Oracle, 232, 348 Oracle tape, 348 Outdegree of a node, 12 P, 256-263, 269-270 P-complete problem CIRCUIT-VALUE, 404 P-completeness, 404 P^A, 348 Pair, tuple, 6 Palindrome, 90, 128 Papadimitriou, Christos H., 417, 418 Parallel computation, 399-404 Parallel random access machine, 400 Parity function, 353 Parse tree, 100 Parser, 99 Pascal, 154 Path Hamiltonian, 264 in a graph, 11 simple, 11 PATH, 259, 322 PCP, see Post correspondence problem PDA, see Pushdown automaton Perfect shuffle operation, 89, 131 PH, 386 Pigeonhole principle, 78, 79, 124 Pippenger, Nick, 402 Polynomial, 154 Polynomial bound, 250 Polynomial time algorithm, 256-263 computable function, 272 hierarchy, 386 verifier, 265 Polynomial verifiability, 265 Polynomial, versus exponential, 257 Polynomially equivalent models, 257 Pomerance, Carl, 415, 418 Popping a symbol, 110 Post correspondence problem (PCP), 199-205 modified, 200 Power set, 6, 53 PRAM, 400 Pratt, Vaughan R., 418 Prefix notation, 8 Prefix of a string, 89 Prefix-free language, 184 Prenex normal form, 225, 310 Prime number, 265, 295, 371 Private-key cryptosystem, 407 Probabilistic algorithm, 368-380 Probabilistic function, 408 Probabilistic Turing machine, 368 Processor complexity, 400 Production, 100 Proof, 17 by construction, 21 by contradiction, 21-22 by induction, 22-25 finding, 17-20 necessity for, 77 Proper subset, 4, 328 Prover, 389 Pseudoprime, 372 PSPACE, 308 PSPACE-complete problem FORMULA-GAME, 314 GG, 317 TQBF, 311 PSPACE-completeness, 309-320 defined, 309 Public-key cryptosystem, 407 Pumping lemma for context-free languages, 123-128 for regular languages, 77-82 Pumping length, 77, 91, 123 Pushdown automaton, 109-122 context-free grammars, 115-122 defined, 111
examples, 112-114 schematic of, 110 Pushing a symbol, 110 Putnam, Hilary, 155 PUZZLE, 297, 330 Quantified Boolean formula, 311 Quantifier, 310 in a logical sentence, 225 Query node in a branching program, 376 Rabin, Michael O., 418 Rackoff, Charles, 417 Ramsey's theorem, 27 Range of a function, 7 Read-once branching program, 377 Real number, 176 Recognizes a language, meaning of, 36, 40 Recursion theorem, 217-224 fixed-point version, 223 terminology for, 221 Recursive language, see Decidable language Recursively enumerable, see Turing-recognizable Recursively enumerable language, 142 Reducibility, 187-211 mapping, 206-211 polynomial time, 272 via computation histories, 192-205 Reduction, 187, 207 mapping, 207 Reflexive relation, 9 Regular expression, 63-76 defined, 64 equivalence to finite automaton, 66-76 examples of, 65 Regular language, 31-82 closure under concatenation, 47, 60 closure under intersection, 46 closure under star, 62 closure under union, 45, 59 decidability, 166-170 defined, 40 Regular operation, 44 REGULAR_TM, 191 Reingold, Omer, 418 Rejecting computation history, 193 Rejecting configuration, 141 Relation, 8, 225 binary, 8 Relatively prime, 260 Relativization, 348-351 RELPRIME, 261 Reverse of a string, 14 Rice's theorem, 191, 213, 215, 242, 244 Rinnooy Kan, A. H.
G., 417 Rivest, Ronald L., 416, 418 Robinson, Julia, 155 Roche, Emmanuel, 418 Root in a tree, 11 of a polynomial, 155 Rule in a context-free grammar, 100, 102 Rumely, Robert S., 415 Ruzzo, Walter L., 417 SAT, 276, 308 #SAT, 392 Satisfiability problem, 271 Satisfiable formula, 271 Savitch's theorem, 305-307 Saxena, Nitin, 415 Schabes, Yves, 418 Schaefer, Thomas J., 418 Scope, 310 Scope, of a quantifier, 225 Secret key, 405 Sedgewick, Robert, 418 Self-reference, 218 Sentence, 311 Sequence, 6 Sequential computer, 399 Set, 3 countable, 175 uncountable, 176 Sethi, Ravi, 415 Shallit, Jeffrey, 415 Shamir, Adi, 418 Shen, Alexander, 418 Shmoys, David B., 417 Shor, Peter W., 418 Shuffle operation, 89, 131 Simple path, 11 Sipser, Michael, 418, 419 Size complexity, 400 Small-o notation, 250 SPACE(f(n)), 304 Space complexity, 303-333 Space complexity class, 304 Space complexity of nondeterministic Turing machine, 304 Space constructible function, 336 Space hierarchy theorem, 337 Spencer, Joel H., 415 Stack, 109 Star operation, 44, 62-63, 295 Start configuration, 141 Start state, 34 Start variable, in a context-free grammar, 100, 102 State diagram finite automaton, 34 pushdown automaton, 112 Turing machine, 144 Stearns, Richard E., 417 Steiglitz, Kenneth, 418 Stinson, Douglas R., 419 String, 13 Strongly connected graph, 12, 332 Structure, 226 Subgraph, 11 Subset of a set, 3 SUBSET-SUM, 268, 292 Substitution rule, 100 Substring, 14 Sudan, Madhu, 415 Symmetric difference, 169 Symmetric relation, 9 Synchronizing sequence, 92 Szegedy, Mario, 415 Tableau, 355 Tarjan, Robert E., 419 Tautology, 382 Term, in a polynomial, 154 Terminal, 100 Terminal in a context-free grammar, 102 Th(M), 226 Theorem, 17 Theory, of a model, 226 3COLOR, 297 3SAT, 274, 359 Tic-tac-toe, game of, 329 TIME(f(n)), 251 Time complexity, 247-294 analysis of, 248-253 of nondeterministic Turing machine, 255 Time complexity class, 267 Time constructible function, 340 Time hierarchy theorem, 341 TM, see Turing
machine TQBF, 311 Transducer finite state, 87 log space, 324 Transition, 34 Transition function, 35 Transitive closure, 401 Transitive relation, 9 Trapdoor function, 410 Tree, 11 leaf, 11 parse, 100 root, 11 Triangle in a graph, 295 Tuple, 6 Turing machine, 137-154 alternating, 381 comparison with finite automaton, 138 defined, 140 describing, 156-159 examples of, 142-147 marking tape symbols, 146 multitape, 148-150 nondeterministic, 150-152 oracle, 232, 348 schematic of, 138 universal, 174 Turing reducibility, 232-233 Turing, Alan M., 2, 137, 155, 419 Turing-decidable language, 142 Turing-recognizable language, 142 Turing-unrecognizable language, 181-182 EQ_TM, 210 Two-dimensional finite automaton, 213 Two-headed finite automaton, 212 2DFA, see Two-headed finite automaton 2DIM-DFA, see Two-dimensional finite automaton 2SAT, 299 Ullman, Jeffrey D., 415, 417, 419 Unary alphabet, 52, 82, 212 function, 8 notation, 259, 295 operation, 44 Uncountable set, 176 Undecidability diagonalization method, 174-181 of A_TM, 174 of E_LBA, 195 of EQ_TM, 192 of E_TM, 189 of HALT_TM, 188 of REGULAR_TM, 191 of EQ_CFG, 172 of Post correspondence problem, 200 via computation histories, 192-205 Undirected graph, 10 Union operation, 4, 44, 45, 59-60 Unit rule, 107 Universal quantifier, 310 Universal state, 381 Universal Turing machine, 174 Universe, 225, 310 Useless state in PDA, 184 in TM, 211 Valiant, Leslie G., 415 Variable Boolean, 271 bound, 310 in a context-free grammar, 100, 102 start, 100, 102 Venn diagram, 4 Verifier, 265, 388 Vertex of a graph, 10 VERTEX-COVER, 284 Virus, 222 Vitányi, Paul, 417 Wegman, Mark, 416 Well-formed formula, 225 Williamson, David P., 416 Window, in a tableau, 279 Winning strategy, 314 Wire in a Boolean circuit, 352 Worst-case analysis, 248 XOR operation, 15, 354 Yannakakis, Mihalis, 418 Yields for configurations, 141 for context-free grammars, 102 Zuckerman, Herbert S., 418

Introduction to the Theory of Computation, Second Edition

This highly
anticipated revision of Michael Sipser's popular text builds upon the strengths of the previous edition. It tells the fascinating story of the theory of computation—a subject with beautiful results and exciting unsolved questions at the crossroads of mathematics and computer science. Sipser's candid, crystal-clear style allows students at every level to understand and enjoy this field. His innovative "proof idea" sections reveal the intuition underpinning the formal proofs of theorems by explaining profound concepts in plain English. The new edition incorporates many improvements students and professors have suggested over the years and offers completely updated, classroom-tested problem sets with sample solutions at the end of each chapter.

About the Author

Michael Sipser has taught theoretical computer science and other mathematical subjects at the Massachusetts Institute of Technology for the past 25 years, where he is a Professor of Applied Mathematics and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). Currently, he is the head of the Mathematics Department. He enjoys teaching and pondering the many mysteries of complexity theory.

Thomson Course Technology is part of the Thomson Learning family of companies—dedicated to providing innovative approaches to lifelong learning. Thomson is learning. Visit Thomson Course Technology online at www.course.com

For your lifelong learning needs, www.thomsonlearning.com

ISBN 0-534-95097-3